ABSTRACT


The purpose of this thesis is to investigate the 
practicality of the use of pseudo-machines for code 
generation. This approach essentially divides the 
translation of a source language program (PASCAL in this 
case) to a real machine into two steps. The first step is 
independent of the target machine; it compiles the program
into a machine language of an imaginary machine. The second
step then takes this imaginary machine code and compiles it
into machine code for a real machine. (Work has been
completed which implements the machine independent first
step, and it is now possible to utilize this compiler in
constructing a set of PASCAL translators for different
machines.)


The design and implementation of a language translator 
for the PASCAL language using this approach are presented. 
A discussion of pseudo-machines and related work is 
presented and the language PASCAL is then briefly described. 
A detailed explanation of the architecture and instruction 
set of the PASCAL pseudo-machine follows. The implementation 
of a compiler is then described. In conclusion, the results 
of this investigation, as well as the effect on machine 
independence, portability and overall efficiency, are
discussed.
CHAPTER 1


INTRODUCTION 


1.1 Pseudo-Machines 


The trend today, in the area of computer programming, 
is towards the development and utilization of higher level 
languages. Languages such as FORTRAN, COBOL, PL/I and ALGOL 
are enjoying widespread popularity in preference to the once 
popular machine level languages. The reasons for this 
popularity can be attributed to their comparative ease of
use and to the fact that programs coded in higher level 
languages are less prone to error than machine language 
programs. Higher level languages allow the programmer to 
focus his attention on the problem he is solving, rather 
than on the details of the machine language. 


However, this development in the area of programming 
languages has not been matched by a similar development in
the architecture of computer hardware, for which language
translators for these languages must be written. The
language translator writer is faced not only with von
Neumann type hardware (that is, without stacks), which is
not ideally suited for many of these languages, but also
with a lack of hardware standardization from manufacturer to
manufacturer, and even from model to model from the same
manufacturer. One current solution to this problem is that a
different language translator be written for each type of
machine the language is to run on. However, it is
immediately obvious that this approach is not very
satisfactory. It is very time-consuming, inefficient and 
quite often, certain features of the language under 
consideration are either implemented inefficiently, or are 
not implemented at all. Another solution is to design a 
language and write its compiler for only one machine, and 
ignore the rest. This approach is even less pleasing than 
the previous one. 


An approach becoming popular today, which solves part 
of these problems for the translator writer, is the use of a 
pseudo-machine. A pseudo-machine for a high level language
is a hypothetical computer, the architecture of which
strongly reflects the nature of the language it is being
used to implement. The instructions of this machine
are usually fairly sophisticated and represent many of the
constructs within the higher level language. This
pseudo-machine greatly simplifies the task of the translator
writer.


Once this pseudo-machine has been designed, a compiler 
is written for the higher level language. This compiler 
emits code from source programs of the higher level language 
in the instruction set of the pseudo-machine. If the
compiler is now rewritten in the language which it compiles,
the compiler becomes portable and machine independent.
Unfortunately, no real machine is capable of directly
executing the pseudo-machine code. 


Several approaches, discussed in the next sections, are 
now possible. An interpreter may be written on an existing 
machine in an existing language. An actual machine may be 
built which executes the pseudo-machine code. An existing 
machine may be micro-programmed to execute the pseudo- 
machine code. Or, a compiler may be written which compiles 
this pseudo-machine code and emits machine code for a real 
machine. This final approach is the subject of this thesis. 
1.2 Related Work


Before proceeding with an examination of pseudo- 
machines for code generation, let us examine some previous 
work in the area of pseudo-machines. 


The idea of the use of pseudo-machines for translator 
writing is not recent. One of the original design papers on 
pseudo-machines was presented by Grau [Grau 1962]. In this 
paper, he presents a discussion of a pseudo-language, called 
L. The language was designed to execute on a stack machine. 
It was also demonstrated that this language could be used
in the translation of ALGOL and ALGOL-like languages.


1.2.1 Interpreters 


One approach to executing the code produced for the
pseudo-machine is to write an interpreter. A program is
written in an existing language, which simulates the actions
of the pseudo-machine on a real machine. One of the earliest
efforts in this direction was an intermediate language 
machine for ALGOL 60 written by van der Poel and simulated 
on the Zebra computer [van der Poel 1962]. 


Another of the early efforts was Randell and Russell's 
ALGOL 60 machine [Randell and Russell 1964] written for the 
KDF-9 computer. This work was the first extensive treatment 
of the problems of implementing ALGOL 60 on the computers
available at that time.


In 1968, Glass [Glass 1968] reported on SPLINTER, an
interpreter written in FORTRAN for PL/I to run on the UNIVAC 
1108. The SPLINTER system emphasized debugging and 
diagnostic capabilities, exploiting the interpretive design 
to supply features difficult to obtain through traditional 
compilation-execution techniques. 


Further efforts were made towards the construction of 
intermediate machines for PL/I. A pseudo-machine first
described by Goodfellow [Goodfellow 1968] was later extended 
by Pullam [Pullam 1969]. This machine features many address 
instructions, the use of self describing (typed) data and 
the use of stacks for storage. 


Sugimoto [Sugimoto 1969] also describes a PL/I machine. 
This machine is based on a list structured machine
organization and uses typed data and multiple address
instructions.


Wortman [Wortman 1972] presents an interpreter for a
pseudo-machine for a dialect of PL/I he calls Student PL.
The main feature of this effort is that the use of
scientific experimentation as a tool is examined for the 
design and evaluation of language directed computers. An 
initial design for a pseudo-machine is presented, followed 
by the experimental techniques used to measure the 
efficiency of this model. Based upon these measurements, a 
modified and improved pseudo-machine is obtained which is 
tailored more specifically to the language for which it is 
used. 


Yet another pseudo-machine designed for PL/I
appears in [Boulton and Jeanes 1972]. The main feature of
this system, called the PLUTO system, is its extensive 
diagnostic and dynamic debugging facilities, made possible 
by the use of a pseudo-machine. 


A recent language translator system has also been
developed for processing the language PASCAL [Jensen 1973a],
[Jensen 1973b] and [Ammann 1973]. The compiler was written
in PASCAL and was translated into the intermediate machine
code used by this system. This has resulted in portability
for PASCAL, in that only an interpreter need be written for
the intermediate machine language, in order that one have a 
working PASCAL compiler. 
1.2.2 Source Language Machines and Micro-Programmed Machines


The utilization of these machines for the execution of 
pseudo-machine code has not been as extensively investigated 
as interpreters. Perhaps the reason for this is the fact 
that machines are more difficult to design and micro-program 
than they are to program. However, several attempts have 
been made, and these are reported below. 


In 1961, Anderson [Anderson 1961] presented the
architecture of a stack machine to execute ALGOL 60. Another
stack machine, designed for the execution of FORTRAN
statements, was presented by Melbourne and Pugmire [Melbourne
and Pugmire 1965]. This machine was micro-programmed on
another machine. Another FORTRAN machine was presented in
1967 [Bashkow et al 1967]. This machine was designed to read
in a FORTRAN source program, load it into core, almost as
is, and execute it.


Weber [Weber 1967] describes a micro-programmed system
which implements EULER on the IBM S/360 Model 30. This
system consists of a translator which is a one-pass, syntax- 
driven compiler which translates EULER source language 
programs into a reverse Polish string form, and an 
interpreter, written in micro-code, which interprets the 
string language programs. It was demonstrated that an 
interpretive language can be executed efficiently by micro- 
programs on existing hardware. 


The design of hardware for a machine capable of 
executing APL was presented in 1970 [Thurber and Myrna 
1970]. A parallel processor was featured for matrix and
vector operations. This machine was, for the major part,
implemented via hardware, the remainder being
micro-programmed. Abrams [Abrams 1970] presented a more complete
design of an APL machine which was based on a detailed 
mathematical analysis of the properties of APL. One of his 
principal results was a method for replacing vector and 
array operations by transformations on their descriptors. 
Another hardware implemented language, SYMBOL, is described 
by Chesley and Smith [Chesley and Smith 1971]. 


More recently, the IBM S/360 Model 25 has been used for 
a micro-programmed implementation of APL [Hassitt et al 
1973]. APL statements and functions are translated into an 
internal format which closely resembles the original. These 
statements and functions are then executed by a micro- 
programmed interpreter. 


Finally, the Burroughs series of machines, beginning
with the B5500, have been among the most successful "Algol" 
machines. The principal feature of these machines is their 
hardware stack which facilitates the task of implementing 
Algol-like languages. 
1.3.1 Approach


The aim of this thesis is to investigate the use of 
pseudo-machines for code generation. This idea is not new, 
in that any multi-pass language translator emitting code for 
a particular machine represents the original source program 
in some intermediate internal form or code for its 
successive passes. For example, we have the Source Language 
Machine of Wilcox [Wilcox 1971]. However, the emphasis in 
this thesis is on machine independence of the intermediate 
pseudo-machine and portability of the compiler which 
produces it. 


A language translator written using this technique now 
becomes divided into two distinct phases, as shown in Figure 
#1. The first, which we shall call the compiler step, 
performs all of the syntax and semantic checking for a 
source program. All variables are replaced by address 
couples, and statements are translated into the code of a 
pseudo-machine. Very briefly stated, the compiler removes 
all symbolic information (such as variable names) from the
source program it compiles. In certain instances, where many 
data types are involved in the source program, tables are 
also generated which hold the type information for all the 
variables in the program. However, no storage is allocated 
for variables. Storage allocation is a machine dependent 
operation, and is left for the second phase. 


The second phase is called the code generation step.
The code and tables generated by the compiler are input into
this phase and code is emitted for a real machine. This step
of the translator is written for each type of machine for
which code is to be emitted.
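The division into two phases can be illustrated with a minimal Python sketch. Nearly everything in it is an assumption made for illustration only: the tuple form of the pseudo-instructions, the ADD mnemonic, the LOAD and ADDR syntax of the imagined register machine, the storage layout, and the function names compile_step and generate_code do not come from this thesis. Only the PUSV instruction and the (level, order) address couples are taken from the text.

```python
# Phase 1 (machine independent): replace variable names by (level, order)
# address couples and emit stack-oriented pseudo-instructions.
# No storage is allocated here.
def compile_step(postfix_tokens, symbol_table):
    code = []
    for token in postfix_tokens:
        if token == "+":
            code.append(("ADD",))                     # hypothetical mnemonic
        else:
            code.append(("PUSV", symbol_table[token]))
    return code


# Phase 2 (machine dependent): allocate storage and emit code for a
# hypothetical register machine, using registers as the expression stack.
def generate_code(pseudo_code, words_per_level=16):
    output, depth = [], 0
    for instruction in pseudo_code:
        if instruction[0] == "PUSV":
            level, order = instruction[1]
            address = level * words_per_level + order  # assumed layout
            output.append(f"LOAD R{depth}, {address}")
            depth += 1
        elif instruction[0] == "ADD":
            depth -= 1
            output.append(f"ADDR R{depth - 1}, R{depth}")
    return output


symbols = {"A": (1, 0), "B": (1, 1)}
pseudo = compile_step(["A", "B", "+"], symbols)
target = generate_code(pseudo)
```

Only generate_code would need to be rewritten for a different real machine; compile_step and the pseudo-instructions it emits stay the same.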


1.3.2 Merits of Approach 


There are several advantages to this approach to
translator writing. One is that the compiler for the
language for which the pseudo-machine has been designed
becomes portable and machine-independent. The pseudo-machine
does not rely upon the quirks of specific hardware.
Implementation of a translator for this language on more
than one machine becomes much easier because the use of a
pseudo-machine allows a bootstrapping process. A compiler is
written for the higher level language in an already existing
language. The compiler is then rewritten in the language
which it compiles, and the first compiler compiles the
rewritten version into the code of the pseudo-machine. The
only step which requires a complete rewrite is the code
generation step. The first step remains invariant from
machine to machine.


This type of implementation strategy is more efficient 
from the standpoint of problem program execution and storage 
allocation than the interpreter approach. The problem 
program is transformed into real machine code, which can 
execute more quickly than an interpreter. And the natural 
storage boundaries of the machine can be used, allowing for 
the compacting of storage. 


However, a slight disadvantage of this approach, when
compared to a one-step translator, is that
two steps must be executed to compile any program of the
higher level language. Hence a higher overhead may be
incurred than with a translator which goes directly to the
code of a real machine. Also, an interpreter usually is
simpler to program than a code generator.


Nonetheless, it is felt that the advantages of this 
approach to translator writing far outweigh any 
disadvantages. 
We have applied this approach of employing a
pseudo-machine for code generation to producing a translator for
the language PASCAL. The results obtained in this thesis
are also applicable in constructing a language translator
for the SUE System Language [Clark 1971]. This is the case
because of the similarity of the two languages. Before
presenting the details of this work (the subject of the
following chapters), a brief discussion of the language
PASCAL is now presented.


Due to his deep dissatisfaction with present day major
languages, Wirth argues, in his Revised Report on PASCAL,
that their " . . . features and constructs too often cannot
be explained logically and convincingly and . . . too often
represent an insult to minds trained in systematic
reasoning" [Wirth 1973]. Feeling that the disorder
governing these languages imposes itself upon one's
thinking, Wirth set forth two aims upon which he based the
development of PASCAL. As stated by Wirth, these are:


(1) " . . . to make available a language suitable to teach
programming as a systematic discipline based on certain
fundamental concepts clearly and naturally reflected by
the language"


(2) " . . . to develop implementations of this language
which are both reliable and efficient on presently
available computers".


1.4.2 Language Description 


The following paragraphs comprise a very brief 
description of the PASCAL language with its data and control 
structures. For a more complete description, see [Wirth 
1973]. In presenting examples of PASCAL programs, we have 
adopted the capitalization rules of the SUE System Language 
[Clark 1971]. Very briefly, these state that all names
invented by the programmer appear with the first letter
capitalized.


A program in the PASCAL language consists of a program 
block. This program block consists of a sequence of
declarations which define labels, manifest constants, types 
and variables. This list is followed by a sequence of 
procedure and function definitions. Finally, these 
definitions are followed by a list of statements which make 
up the body of the program. Procedures and functions are 
defined recursively as program blocks. 




Several basic data types exist in the language. These 
are integer, real, character, and boolean. The programmer 
also has the ability to define his own data types by 
specifying the symbolic names of all constants belonging to 
that type. For example:


     type Operator = (Plus, Minus, Multiplies, Divides);


Structured types are available in the language. These 
include arrays, records, pointers, sets and files. These 
structured data types are constructed using the above four 
basic types plus programmer types as the element types. More 
complex data structures can be constructed recursively.


File data types are structures consisting of a sequence 
of components which are all of the same type. Associated 
with the file is a buffer variable which contains the last 
component input or the component to be output. Files cannot 
be defined recursively. 


Variables may be declared to be of type <constant> ..
<constant>. This defines a subrange of values which the
variable may validly assume.
The control constructs of PASCAL allow both repetition
and selection of statements. Repetition can be controlled by
tests at either the beginning or end of the sequence of
statements. The while <expression> do <statement> construct
allows testing at the beginning of the sequence, whereas
repeat <statement list> until <expression> allows testing at
the end of the sequence. Bounded repetition is provided by
the construct for <control variable> := <expression1> to
<expression2> do <statement>.


Selection is provided by the case and if then else
constructs. The case construct selects a particular
statement from a list of statements for execution. Each 
statement is labelled by at least one constant of the type 
of the expression in the case header. The if then else 
construct provides selection capability for a boolean 
expression. 


Finally, unconditional jumps are made possible by the
inclusion of the goto construct in the language. 




CHAPTER 2 


THE PASCAL PSEUDO-MACHINE 


In this chapter, the architecture of an intermediate
pseudo-machine used to implement PASCAL is described. The
principal criterion used in the design of this
pseudo-machine was to facilitate code generation for real,
commercial machines which exist today (for example, the IBM
S/370 and PDP-11). The initial design was based on a
compiler-interpreter written by U. Ammann and K. Jensen
[Ammann 1973], [Jensen 1973a] and [Jensen 1973b]. Their
compiler and pseudo-machine have been taken and modified to
suit the purposes of code generation. The major
modifications included alteration of the method of storage
allocation for variables and the method for accessing them,
and the addition of a range table to the output of the
compiler. In order to produce a PASCAL compiler which emits
code for our pseudo-machine, we decided to use as much of
the existing PASCAL compiler as possible. This was done as a
time-saving measure.


A description of the architecture of the intermediate 
PASCAL machine is presented in detail in the following 
sections. Each component and its role in the pseudo-machine 
are described. Finally, the ideal machine language is 
introduced and the code sequence emitted for each PASCAL 
statement is given. 


The semantics of each instruction of the pseudo-machine 
are discussed independent of PASCAL, although we do not 
intend that these instructions be executed. We felt that
this was the most convenient means of describing the
pseudo-machine language. Also, by presenting these semantics,
one may define correctness of a program written in the
pseudo-machine language independent of its representation in
PASCAL. Hence, a proof of correctness for the compiler in
its transformation of a PASCAL source language program to a
pseudo-machine language program is now possible. If a PASCAL
source program and the corresponding pseudo-machine language
program produced by a compiler yield the same results, then
the compiler was correct in its translation of that program.
This problem is not investigated in this thesis. 


Throughout the following discussion, intermediate
machine and pseudo-machine are used interchangeably. The
compiler and compilation step refer to the first step of the
language translator, and the code generator and code
generation step refer to the second step.

2.1 Architecture of the Intermediate Machine


This section describes each component of the ideal
pseudo-machine illustrated in Figure 2-1. The overall
structure of the machine is that of a so-called
"Algol-Machine"; that is, the predominant feature of the machine is
its reliance upon the stack. This basic design was chosen
because of the block structure and recursive nature of
PASCAL. There are four major stacks, three of which have
their own displays which delimit or mark off the various
lexic levels within the stacks. There are also two tables,
one of which contains the pseudo-machine code to be executed
or translated to "real" machine code. The other contains
range information of the various variables used throughout
the executing program. Finally, two registers contain status
information about the executing program.


The Pseudo-Machine Code Table contains the PASCAL
pseudo-machine code emitted for the PASCAL program. This
code consists of PASCAL pseudo-machine instructions with the
following format:


     label   op-code   operand1, operand2


In the above instruction, the label field may or may not
appear, depending upon whether a branch is made to this
instruction. The op-code is the mnemonic of an instruction
of the PASCAL ideal machine. One or both of the operands of
this instruction, operand1 and operand2, may be missing
depending upon the instruction.
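As an illustration of this format, the following Python sketch reads one instruction line into its four fields. It is only a sketch under stated assumptions: the thesis does not say how a label is told apart from an op-code, so here any leading field that is not a known op-code is taken to be a label. The names Instruction and parse_instruction, the HALT mnemonic, and the choice to write an address couple as two comma-separated operands are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instruction:
    op_code: str
    operand1: Optional[str] = None
    operand2: Optional[str] = None
    label: Optional[str] = None


def parse_instruction(line, op_codes):
    """Split 'label op-code operand1, operand2' into its fields."""
    head, _, rest = line.strip().partition(" ")
    label = None
    if head not in op_codes:            # assumed rule: first field is a label
        label = head
        head, _, rest = rest.strip().partition(" ")
    if head not in op_codes:
        raise ValueError(f"unknown op-code: {head!r}")
    operands = [field.strip() for field in rest.split(",") if field.strip()]
    if len(operands) > 2:
        raise ValueError("at most two operands are allowed")
    operands += [None] * (2 - len(operands))
    return Instruction(head, operands[0], operands[1], label)


OP_CODES = {"PUSA", "PUSV", "HALT"}     # HALT is a made-up mnemonic
with_label = parse_instruction("L5 PUSV 1, 3", OP_CODES)
no_operands = parse_instruction("HALT", OP_CODES)
```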


The Run Stack contains the block marks for the
currently active procedures and functions. Each time a new
scope block is entered, a block mark with the following
information is pushed onto this stack:


     return address
     save for displays
     statement register save
     dynamic link
     static link
     result (function)


This record contains the status information of the
calling program. This includes the address of the first
instruction to be executed after control returns from the
called procedure (return address), the status of the various
displays at invocation time (save for displays), the
contents of the statement register (statement register
save), and the dynamic and static links (dynamic link and
static link). The dynamic link points back to the invoking
procedure; the static link points to the scope level
immediately surrounding the invoked procedure. The result
field is used for function invocations: the result of the
function is stored in this field. Note that the returned
[Figure 2-1: block diagram of the pseudo-machine. Components shown:
Statement Register, Program Counter, Pseudo-Machine Code, Run Display,
Run Stack, Local Variable Display, Local Variable Stack, Expression
Display, Expression Stack, Range Table.]


Figure 2-1 The PASCAL Pseudo-Machine
result of a function in PASCAL may only be a scalar or a
subrange type.


The Run Display is a stack which delimits the lexic
levels within the run stack. The i-th entry points to the
block mark for the last-invoked, currently active procedure
at lexic level i.
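The interplay of block marks and the run display can be sketched in Python as follows. This is a minimal model under stated assumptions, not the design used in this thesis: stack positions are list indices, only the run display is saved in the block mark (the text says the various displays are saved), the main program is placed at lexic level 0, and the names BlockMark, RunStack, enter and leave are invented for the example.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class BlockMark:
    return_address: int
    saved_displays: List[Optional[int]]
    saved_statement: int
    dynamic_link: Optional[int]     # index of the invoking block mark
    static_link: Optional[int]      # index of the enclosing scope's mark
    result: Any = None              # used by functions only


class RunStack:
    def __init__(self, max_level=8):
        self.marks = []
        # display[i] -> index of the newest active block mark at level i
        self.display = [None] * max_level

    def enter(self, level, return_address, statement, static_link):
        mark = BlockMark(
            return_address=return_address,
            saved_displays=list(self.display),
            saved_statement=statement,
            dynamic_link=len(self.marks) - 1 if self.marks else None,
            static_link=static_link,
        )
        self.marks.append(mark)
        self.display[level] = len(self.marks) - 1

    def leave(self):
        mark = self.marks.pop()
        self.display = mark.saved_displays   # restore caller's view
        return mark.return_address, mark.result


run = RunStack()
run.enter(0, return_address=0, statement=0, static_link=None)  # main program
run.enter(1, return_address=42, statement=7, static_link=0)    # first call
run.enter(1, return_address=99, statement=3, static_link=0)    # recursive call
innermost = run.display[1]
returned_to, _ = run.leave()
```

After the recursive call returns, entry 1 of the display again points at the block mark of the first call.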


The Local Variable Stack contains the storage for all 
variables and parameters of the procedures and functions. 
The variables of a PASCAL program are replaced by the 
compiler by (lexic level, order number) address couples 
which map into storage locations on this stack. The actual
allocation of storage for the stack (and the mapping of the
address couples into locations in this stack) is left to
the code generation step.


The Local Variable Display is a stack which delimits 
the lexic levels within the local variable stack. The i-th 
entry locates those variables at lexic level i which are 
legally accessible. 
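One way a code generator might map address couples onto locations is sketched below in Python. The layout is an assumption made for illustration: every variable occupies exactly one cell, order numbers start at 0, and the names LocalVariableStack, enter_block, leave_block and locate are hypothetical. A real code generator would instead use the sizes and boundaries of its target machine.

```python
class LocalVariableStack:
    def __init__(self, max_level=8):
        self.cells = []
        # display[i] -> index of the first variable at lexic level i
        self.display = [None] * max_level

    def enter_block(self, level, variable_count):
        saved_base = self.display[level]
        self.display[level] = len(self.cells)
        self.cells.extend([None] * variable_count)
        return saved_base               # caller keeps this for leave_block

    def leave_block(self, level, saved_base):
        del self.cells[self.display[level]:]
        self.display[level] = saved_base

    def locate(self, level, order):
        """Map the address couple (level, order) to a cell index."""
        base = self.display[level]
        if base is None:
            raise LookupError(f"lexic level {level} is not accessible")
        return base + order


frames = LocalVariableStack()
frames.enter_block(0, 3)                # three variables at level 0
saved = frames.enter_block(1, 2)        # two variables at level 1
slot = frames.locate(1, 1)
outer_slot = frames.locate(0, 2)
frames.leave_block(1, saved)
```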


The New Variable Stack contains the storage for
variables allocated by the standard procedure new. The 
format of the entries in this stack is identical to that of 
the local variable stack. 


The Expression Stack is used in the evaluation of 
expressions. Each entry on this stack can hold any scalar 
item. All arithmetic and relational operations take place 
using the top elements of this stack. This stack is also 
used to hold the parameters of the standard procedures and 
functions. 


The Expression Display is a stack used to delimit the
lexic levels within the expression stack. It is necessary in
order to allow the clean-up of the expression stack if an
exit is made from a function by the use of a goto statement.
If, in the middle of the evaluation of an expression, a
function is invoked, and the invoked function exits by means
of a goto, then it is possible that partially computed
results may be present on the stack. This display is used to
ensure that the valid portions of the stack are kept intact.
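The clean-up described above can be sketched in Python. The sketch rests on one interpretation, which is an assumption rather than something stated in the text: each display entry records the height of the expression stack when the block at that lexic level was entered, and a goto to a label at a given level cuts the stack back to that height. The class and method names are hypothetical, and the saving of display entries across recursive calls is omitted.

```python
class ExpressionStack:
    def __init__(self, max_level=8):
        self.items = []
        # display[i] -> stack height when the block at level i was entered
        self.display = [0] * max_level

    def enter_block(self, level):
        self.display[level] = len(self.items)

    def push(self, value):
        self.items.append(value)

    def goto_into(self, target_level):
        """Discard partial results left above the target block's base."""
        del self.items[self.display[target_level]:]


stack = ExpressionStack()
stack.enter_block(0)      # main program
stack.push("A")           # main is part-way through an expression
stack.enter_block(1)      # function P is called from within it
stack.push("B")           # P is part-way through its own expression
stack.enter_block(2)      # function F, local to P, is called
stack.push("C")           # a partial result inside F
stack.goto_into(1)        # F executes a goto to a label in P
```

The partial results of P and F are discarded, while the entry belonging to the still active expression in the main program is kept.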


In many pseudo-machines, one finds the preceding three
stacks merged. However, if one or all of these stacks are to
be implemented to be independent of the others (for
example, on the IBM S/370, the expression stack could be
implemented using the general purpose and floating point
registers), then this separate definition of the stacks
becomes helpful during code generation. Nevertheless, there
is nothing preventing the implementation of this
pseudo-machine on a real machine with two or three of the stacks
combined, with a corresponding change in the displays. This
definition appears flexible enough for many types of current
hardware.
The Range Table contains the range information of all
variables used in the program. Since the compilation step
can perform all type checking necessary (parametric
procedures and functions are not supported), only range 
information for the variables need be kept available for 
code generation. This table also contains the subscript 
information necessary to perform array indexing and record 
subfield accessing. 


The range table, generated by the compile step and 
passed to the code generation step, has the following PASCAL 
definition: 


const Maximum_number_of_range_elements = 250;

type Range_index_type = 0 .. Maximum_number_of_range_elements;

     Range_types = (Integer_type, Real_type,
                    Char_type, Pointer_type,
                    Array_type, Record_type,
                    Set_type, File_type);

     Storage_types = (Compacted, Not_compacted);

     Range_element =
        record
           Storage: Storage_types;
           Subrange: boolean;
           Displacement: integer;
           case Variable_type: Range_types of
              File_type: ();
              Pointer_type: (Object_type:
                             Range_index_type);
              Integer_type, Set_type, Char_type:
                            (Lower_limit, Upper_limit:
                             integer);
              Real_type: (Real_lower_limit,
                          Real_upper_limit: real);
              Array_type: (Number_of_dimensions:
                           integer);
              Record_type:
                          (Number_of_subfields: integer)
        end;

     Range = array [ 1 .. Maximum_number_of_range_elements ]
                of Range_element;

var Range_table: Range;


For all types, we record the storage type (compacted or
not compacted), whether it is a subrange, its displacement
if it is a subfield of a record, and its basic data type
(integer, real, character, pointer, file, array, record, or
set). For pointer types, we require information regarding its
object type to supply information for the code generator
when processing references to the standard procedure new.
Reals, integers and characters, if they are subrange types,
as well as sets, also have their lower and upper limits
recorded. For sets, these limits represent the lower and
upper limits of the base elements. The number of dimensions
is recorded for an array type, and the number of subfields
(including tag fields and variants) is noted for record
types.

The first four entries in the table are the four 
standard types: 


1 integer 
2 real 
3 boolean 
4 char 


The fifth and sixth entries contain the description of the
type text (packed file of char). The first six entries
follow.


Range_table[1].Storage := Not_compacted;      { integer }
              .Subrange := false;
              .Variable_type := Integer_type;


Range_table[2].Storage := Not_compacted;      { real }
              .Subrange := false;
              .Variable_type := Real_type;


Range_table[3].Storage := Not_compacted;      { boolean }
              .Subrange := true;
              .Variable_type := Integer_type;
              .Lower_limit := 0;
              .Upper_limit := 1;


Range_table[4].Storage := Not_compacted;      { char }
              .Subrange := false;
              .Variable_type := Char_type;


Range_table[5].Storage := Compacted;          { text }
              .Subrange := false;
              .Variable_type := File_type;


Range_table[6].Storage := Not_compacted;
              .Subrange := false;
              .Variable_type := Char_type;
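For readers who prefer an executable form, the six predefined entries can be mirrored in Python as below. This is a sketch only. The class and field names are our own, the variant part of the record is flattened into optional fields, and the in_range helper is a hypothetical illustration of how a code generator might use the limits; it is not part of the thesis.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class RangeType(Enum):
    INTEGER = auto()
    REAL = auto()
    CHAR = auto()
    POINTER = auto()
    ARRAY = auto()
    RECORD = auto()
    SET = auto()
    FILE = auto()


@dataclass
class RangeElement:
    compacted: bool
    subrange: bool
    variable_type: RangeType
    displacement: int = 0
    lower_limit: Optional[int] = None
    upper_limit: Optional[int] = None


def initial_range_table():
    """Entries 1 to 6 as listed in the text (index = range number)."""
    return {
        1: RangeElement(False, False, RangeType.INTEGER),        # integer
        2: RangeElement(False, False, RangeType.REAL),           # real
        3: RangeElement(False, True, RangeType.INTEGER,
                        lower_limit=0, upper_limit=1),           # boolean
        4: RangeElement(False, False, RangeType.CHAR),           # char
        5: RangeElement(True, False, RangeType.FILE),            # text
        6: RangeElement(False, False, RangeType.CHAR),           # its component
    }


def in_range(element, value):
    """Hypothetical check: only subrange entries restrict their values."""
    if not element.subrange:
        return True
    return element.lower_limit <= value <= element.upper_limit


table = initial_range_table()
boolean_entry = table[3]
```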

Each type defined in the user's program has a range
number associated with it by the compiler. This range number
is actually the index into the range table for that type.


Entries are made into this table, for type definitions,
as follows:


(a) For programmer types (for example, type Colour=(Red,
Orange, Yellow, ...)), the range is recorded as an
unpacked subrange of the integers with the lower limit
zero, and upper limit as the number of constants in the
type minus one.


(b) For subrange types the lower and upper bounds are 
recorded and the containing type is either real, for a 
real subrange, character for character subranges, or 
integer for all other cases. Booleans are recorded as a 
subrange of the integers. This is the case because 
PASCAL allows the comparison of boolean values for both 
equality and inequality. 


(c) File types are recorded as themselves. This is followed 
by an entry describing the component type of the file. 


(d) Set types are recorded by noting the lower and upper 
limits of the base type. 


(e) Pointer types are recorded by noting the range number 
of the object type. This information is necessary for 
references to the standard procedure new. 


(f) Arrays are recorded by noting the number of dimensions.
This is followed by as many entries as there are
subscripts, the i-th entry denoting the type of the
i-th subscript. These are then followed by an entry
describing the element type of the array.


(g) Record types are recorded by noting the number of
subfields. This count includes the number of fixed
subfields, plus one for the tag field if present, plus
the number of variants. This is followed by entries
describing each subfield, tag field, and variant.
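Rules (a) and (f) can be illustrated with a short Python sketch. It is written under assumptions that the text does not settle: entries are plain dictionaries, the entries that follow an array header are copies of the entries for the subscript and element types, and the element type is taken to occupy a single entry. The function names add_enumeration and add_array are hypothetical.

```python
def add_enumeration(table, constant_names):
    """Rule (a): an unpacked integer subrange 0 .. number of constants - 1."""
    table.append({"type": "integer", "subrange": True, "compacted": False,
                  "lower": 0, "upper": len(constant_names) - 1})
    return len(table)                   # range number (1-based index)


def add_array(table, subscript_ranges, element_range):
    """Rule (f): header, one entry per subscript, then the element type."""
    table.append({"type": "array", "dimensions": len(subscript_ranges)})
    number = len(table)
    for range_number in subscript_ranges:
        table.append(dict(table[range_number - 1]))   # assumed: a copy
    table.append(dict(table[element_range - 1]))      # single-entry element
    return number


# Entry 1 stands in for the standard type integer.
table = [{"type": "integer", "subrange": False, "compacted": False}]
colour = add_enumeration(table, ["Red", "Orange", "Yellow"])
vector = add_array(table, [colour], 1)  # array [Colour] of integer
```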


The real machine storage requirements for each type can 
be computed at code generation time from the information 
contained in this table. At code generation time, the code
generator maintains a local variable stack, which contains
the attributes and range information of the variables, and
an expression stack, which contains the attributes of the
elements of expressions. With these, it can keep track of the
displacements and lengths of the various variables used in
the compiled program.


Finally, there are two registers which are part of this 
pseudo-machine. The Program Counter contains the address of 
the next-to-be-executed instruction in the Pseudo-Machine 
Code Table. The Statement Register contains the number of 
the currently executing statement. This information is used 
in the production of run time error messages.


The PASCAL Pseudo-Machine Language


The PASCAL Pseudo-Machine Language is presented in the
following subsections. We will introduce the instructions,
as required, in order to compile the various constructs and
statements of PASCAL.


Variables are declared by the DECL instruction, which
is discussed in section 2.3.3. For scalar variable access, 
two instructions are used: 


PUSA (level,order) (PUSh Address) This instruction
pushes the address of the variable
specified by the address couple
(level,order) onto the expression
stack. The address couple (l,n)
refers to the n-th variable
declared at lexic level l.


PUSV (level,order) (PUSh Value) This instruction
pushes the value of the specified 
variable onto the expression stack. 


In order to access an element of an array, the address 
of the array must first be pushed onto the expression stack. 
In a simple case, where the array is not a component of any
other structured variable, this may be accomplished by PUSA
(level,order). (More complex cases will be demonstrated by
example later.) Once the address of the array is on the
expression stack, individual elements are accessed by first
pushing the subscript values (after evaluation, if
necessary) onto the expression stack, and then by use of one 
of the following instructions: 


INDA n (INDex array Address) This
instruction indexes the array whose
address is the top element of the
expression stack and stacks the
address of the specified array
element. The operand n specifies
the number of subscripts on the
expression stack that are to be
used.


INDV n (INDex array Value) The value of
the array element accessed is 
pushed onto the expression stack. 


Both of these instructions pop the expression stack n+1
times before stacking the result. 
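
The stack effect of INDA (n+1 pops, one push) can be sketched as follows. This is a sketch under assumptions, not the thesis implementation: the subscripts are taken to lie above the array address on the stack, storage is assumed to be row-major, and the `bounds` and `element_size` parameters stand in for information the code generator would take from the range table.

```python
def INDA(stack, n, bounds, element_size=1):
    """Pop n subscripts and the array address; push the element address.

    bounds holds one (low, high) pair per dimension; row-major order is assumed.
    """
    subscripts = [stack.pop() for _ in range(n)][::-1]   # n pops
    base = stack.pop()                                   # one more pop
    offset = 0
    for sub, (low, high) in zip(subscripts, bounds):
        offset = offset * (high - low + 1) + (sub - low)
    stack.append(base + offset * element_size)

# Address of A (hypothetically 1000), then the subscripts of A[2,3].
stack = [1000, 2, 3]
INDA(stack, 2, [(1, 5), (1, 10)])
```

INDV would differ only in fetching the contents of the computed address before pushing.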


In a manner similar to that of accessing an array
element, subfields of records are accessed by first pushing
the address of the record in which they are contained onto the
expression stack. The subfields are then accessed by use of
one of the following instructions:


SFIA n (Sub FIeld Address) The address of 
the n-th subfield of the record 
whose address is the top element of 
the expression stack is pushed onto 
the same stack. 


SFIV n (Sub FIeld Value) The value of the
n-th subfield of the record whose
address is the top element of the
expression stack is pushed onto the
same stack.


In both instructions, the address of the record is first
popped off the expression stack, before the address or value
of the subfield is pushed on. The instructions for
accessing array elements and subfields of records are used
recursively in accessing elements in records with arrays as
subfields, and accessing subfields in arrays of records.


The accessing of constants requires its own set of
instructions. Variants of the following instruction are
used: 


PCVx constant (Push Constant's Value of type x) 
The value of the constant is pushed 
onto the expression stack. 


In the above instruction, x must be one of I, R, S, C or B.
These suffixes denote the type of the constant being
accessed; they are defined as: I-integer, R-real, S-set,
C-character constant of length one and B-boolean.


Use of the constant nil requires the following
instruction: 


PCVN (Push Constant Value Nil) This 
instruction pushes the value of the 
constant nil onto the expression 
stack. 


Accessing character strings of length greater than one
makes use of:


PCAD constant (Push Constant's ADdress of type
packed array of character) The
address of the constant is pushed
onto the expression stack.


Finally, one further instruction is required for the 
accessing of the values of variables. This is: 


EVAL (EVALuate) This instruction
replaces the address at the top of 
the expression stack by the 
contents of that location. 


This is used in the accessing of scalar variables which are 
objects of pointers and are allocated by the standard 
procedure new. This instruction is also required for 
accessing call by reference parameters, since the address of 
the variable is passed. 
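
The access instructions for scalars can be illustrated on a toy machine. The sketch below is an assumption-laden model, not the thesis implementation: addresses are plain integers, and the memory contents and the mapping from address couples to addresses are invented for the example.

```python
# Toy model: the variable layout below is hypothetical.
memory = {100: 7, 101: 100}                 # location 101 holds the address 100
address_of = {(1, 0): 100, (1, 1): 101}     # address couple -> address
stack = []                                  # the expression stack

def PUSA(level, order):
    """Push the address of the variable named by the address couple."""
    stack.append(address_of[(level, order)])

def PUSV(level, order):
    """Push the value of the variable named by the address couple."""
    stack.append(memory[address_of[(level, order)]])

def PCVI(constant):
    """Push an integer constant."""
    stack.append(constant)

def EVAL():
    """Replace the address at the top of the stack by its contents."""
    stack.append(memory[stack.pop()])

PUSA(1, 0)    # address of the first variable
PUSV(1, 0)    # its value
PUSV(1, 1)    # value of a pointer (or call by reference parameter): an address
EVAL()        # follow that address to the object's value
PCVI(42)
```

The PUSV followed by EVAL is the pattern described above for objects of pointers and for call by reference parameters.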


2.3.2 Storing and Transfer of Values to Variables


There are two methods for storing results from the 
expression stack into variables. These make use of the 
following two instructions: 


STOR (STORe) This stores the value 
located at the top of the 
expression stack into the location 
specified by the next to top 
element in the expression stack. 
These two elements are then popped 
off the stack. This method of value 
storing is used when an address 
calculation is necessary in order 
to access a location (for example, 
array elements). 


STOW (level,order) (STOW) This instruction specifies
that the value located at the top
of the expression stack is to be
stored into the location specified
by the variable whose address
couple is (level,order). The
expression stack is then popped
once. This is used when no address
calculation is necessary in order
to access the location (for example,
a scalar variable).


Also necessary is an instruction for storing the result
of a function in the block mark associated with that
function. Since only scalars may be the result of a
function, the following instruction is sufficient:


STFR level (STore Function Result) This
instruction places the value
located at the top of the
expression stack into the block
mark of the function currently
executing at the current lexic
level.
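
The difference between STOR and STOW lies only in where the target address comes from. A minimal sketch, assuming integer addresses and an invented mapping from address couples to addresses (neither is specified by the source):

```python
memory = {100: 0, 200: 0}        # hypothetical storage
address_of = {(1, 0): 100}       # hypothetical address couple -> address
stack = []                       # the expression stack

def STOR():
    """Store the top value at the address held next to top; pop both."""
    value = stack.pop()
    address = stack.pop()
    memory[address] = value

def STOW(level, order):
    """Store the top value into the variable named by the address couple; pop once."""
    memory[address_of[(level, order)]] = stack.pop()

# Target reached by an address calculation (for example an array element):
stack.append(200)    # the computed address
stack.append(5)      # the value
STOR()

# Target is a scalar variable known by its address couple:
stack.append(9)
STOW(1, 0)
```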


Finally, the movement of structured data is facilitated 
by the use of the following instruction: 


MOVS (MOVe Structured) This instruction
copies the record or array
specified by the top element of the
expression stack to the location
specified by the next to top
element of this stack. Both
addresses are popped off the stack
when the operation is complete.


2.3.3 Function and Procedure Linkage


With every invocation of a function or procedure, a
block mark is pushed onto the run stack, and the various
displays are updated. Each block mark has the
format described in section 2.2.


Note that this type of block mark and method of handling
procedure invocation does not allow parametric procedures
and functions. This is the case because this method does
not record the environment of a procedure when it is passed 
as a parameter. 


This block mark is pushed onto the run stack by the 
following instruction:


MKST n (Mark run STack) This instruction 
pushes a block mark onto the run 
stack. It sets both the dynamic and 
static links, saves the contents of 
the run, expression and local 
variable stack displays at level n, 
and the value of the statement 
register in this block mark. 


After the block mark has been pushed onto the run 
stack, the parameters to be passed are evaluated and are 
pushed onto the appropriate stack. For user procedures and 
functions, the parameters are pushed onto the local variable 
stack. For standard procedures and functions, a block mark 
is not necessary and the parameters (which may only be
scalars) are pushed onto the expression stack. We do not
require a block mark for standard procedures since they are
not recursive. In order to accomplish parameter
passing for user procedures, the following instruction is 
required: 


PARM n,kind (PARaMeter) This instruction 
allocates sufficient space on the 
local variable stack to contain a 
variable of type n, where n is a 
range number. The kind operand 
specifies whether the parameter is 
call by value, or call by
reference. In the latter instance, 
the space reserved on the local 
variable stack contains the address 
of the specified variable. 

The parameters are now stored on the local variable
stack in the manner described in section 2.3.2.


Once the parameters have been evaluated and placed onto
the appropriate stack, the subprogram is called. Standard
procedures are called using:


CSPR name (Call Standard PRocedure) The
standard procedure "name" is
called.


User procedures and functions are called using: 


CUPR entry-point (Call User PRocedure) This
instruction first places the return
address in the block mark. It also
interchanges the saved values
of the displays with the display
entries at level n, the level of
the called program, which is
provided by the MKST instruction
which precedes it. It then passes
control to the user program, at
location entry-point.


Upon entry to the user program, the following
instruction supplies the lexic level of the called
procedure:


LXLV level (Lexic Level) The following 
procedure will execute at lexic 
level "level". 


The following instruction provides the necessary 
information for the code generator to reserve storage for 
each local variable, on the local variable stack:


DECL n,kind (DECLare) This instruction declares
a variable whose range number is n.
The operand n is an index into the
range table. The operand kind may
take on one of three values. It may
specify that space is to be
allocated for a locally declared
variable. Or, it may specify that a
variable is a call by value
parameter, or a call by reference
parameter. These latter two values
act only to provide information to
the code generator, but not to
cause the allocation of space.


One instruction of this type is emitted for each variable 
and parameter. All declarations are collected together and 
placed at the head of the procedure to which they are local. 


Upon completion of a user subprogram, control is 
returned to the invoking program by the use of the following 
instruction:


RETN type (RETurN) This instruction returns
control to the calling procedure,
and restores the displays and
statement register. When type=F
(the function return), the function
result located in the block mark
is pushed onto the top of the
expression stack. When type=P (the
procedure return), no result is
pushed onto the expression stack.
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2.3.4.1 Relational Operators 


Expressions in PASCAL consist of two simple expressions
surrounding a relational operator or a single simple
expression. For the comparison of scalars, set and subrange
types, the following instructions are used: 


EQLx (EQuaL of type x) The top two
elements of the expression stack
are checked for equality. If the
two elements are equal, the boolean 
true is pushed onto the expression 
stack after the two elements are 
popped off; otherwise, the boolean 
false is pushed on after the two 
elements are popped off. 


NTEx (NoT Equal of type x) The top two
elements of the expression stack
are checked for inequality. The
boolean result is pushed onto the 
stack after the two elements are 
popped off. 


In the above two instructions, the letter x is replaced by 
one of the following to indicate the type of the elements on 
the expression stack: I-integer, R-real, C-one character,
T-sets, B-boolean, and P-pointers.


Comparison of structured types, such as arrays and 
records, may be done using the following instruction: 


EQLS (EQuaL Structured) The structured 
variable whose address is the top 
element of the expression stack is 
compared to the structured variable 
whose address is the next to top 
element of the stack. Both 
addresses are popped from the stack 
and the result true is pushed onto
the stack if the two variables were
equal; the result is false
otherwise.


NTES (NoT Equal Structured) This
instruction compares the structured
variable whose address is the top
element of the expression stack
against the structured variable
whose address is the next to top
element of the stack for
inequality. If the two are unequal,
the boolean true is stacked;
otherwise false is stacked.


The following instructions implement the inequality
operations <, >, <= and >= of PASCAL:


LSTx (LeSs Than of type x) This
instruction compares the next to
top element of the expression stack
to the top element, and stacks the
boolean true if the former is less
than the latter; false is stacked
otherwise.


GRTx (GReater Than of type x) This
instruction stacks true if the next
to top element of the expression
stack is greater than the top
element; false is stacked
otherwise.


LEQx (Less than or EQual of type x) This
instruction stacks true if the next
to top element of the expression
stack is less than or equal to the
top element; false is stacked
otherwise.


GEQx (Greater than or EQual of type x)
The boolean true is stacked if the
next to top element of the
expression stack is greater than or
equal to the top element;
otherwise, false is stacked.


The above four operations pop the top two elements off the
expression stack before stacking the boolean result. The
letter x in the above instructions represents the type of
the operands on the expression stack, and is replaced by one
of I-integer, R-real, C-single character, or B-boolean.
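
The operand order matters for these instructions: the next to top element is the left operand. A small sketch for the integer variants follows, assuming that the left operand of a PASCAL relation is evaluated and pushed first; the helper `compare` is invented for the illustration.

```python
import operator

stack = []  # the expression stack

def compare(relation):
    """Pop two operands and push relation(next_to_top, top)."""
    top = stack.pop()
    next_to_top = stack.pop()
    stack.append(relation(next_to_top, top))

def LSTI():
    compare(operator.lt)

def GRTI():
    compare(operator.gt)

def LEQI():
    compare(operator.le)

def GEQI():
    compare(operator.ge)

# 3 < 8: the left operand is pushed first, so it ends up next to top.
stack.extend([3, 8])
LSTI()
result_less = stack.pop()

# 3 >= 8, evaluated the same way.
stack.extend([3, 8])
GEQI()
result_geq = stack.pop()
```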


The only structured types on which relational operators
other than equals and not equals may be used are packed 
arrays of characters. These types require their own set of 
instructions to implement the inequality operations: 


LSTS (LeSs Than Structured) This
instruction compares the packed
array of characters whose address
is the next to top element of the
expression stack against the array
whose address is the top element of
the expression stack. The two
addresses are popped off the stack
and the boolean result is pushed
on; true is pushed on if the first
array is less than the second
array; otherwise, false is pushed
on.


GRTS (GReater Than Structured) This
instruction performs similarly to
LSTS, except that the inequality
checked for is greater than.


LEQS (Less than or EQual Structured)
This instruction operates similarly
to LSTS except that the inequality
checked for is less than or equal.


GEQS (Greater than or EQual Structured)
The inequality checked for is
greater than or equal.


Several instructions are also included to handle set
manipulations. These are:


LINR (Left set INcluded in Right) This
instruction checks whether the next
to top element of the expression
stack is a set included in the top
element of the stack. Both operands
are popped from the stack before
the boolean result is pushed on.


RINL (Right set INcluded in Left) The
top element of the expression stack
is checked for set inclusion in the
next to top element of the stack.
The boolean result is pushed onto
the expression stack after the two
operands are popped off.


INNN n (membership IN a set) This
instruction checks whether the next
to top element of the expression
stack is an element of the set
(whose range number is n) which is
the top element of the stack. Both
operands are popped off the stack
before the boolean result is pushed
on.


When used with set operands, the minus sign indicates
set difference. The following instruction implements this
use of it:


DIFF (set DIFFerence) The top two
elements of the expression stack
are replaced by their set theoretic
difference. The result is the set
difference formed by taking the top
element of the stack from the next
to top element.


The language PASCAL also allows the construction of 
sets. This requires the following instruction:


GNSS n (GeNerate Singleton Set) The top
element of the expression stack is
replaced by a set whose only
element is this item. The operand
n is a range number which supplies
the information regarding the type
of the base elements in the set,
for the code generator.


The union of sets and the boolean "or", both
represented by the same operator in PASCAL, have the
following machine instructions:


UNIN (UNION) The top two elements of the
expression stack are replaced by
their set theoretic union.


LIOR (Logical Inclusive OR) The top two
elements of the expression stack
are replaced by their logical
inclusive or.


Terms in PASCAL are one or more factors connected by
multiplying operators. The multiplying operators of PASCAL
are multiplication, division, division with truncation,
modulus, logical and and set intersection.


Multiplication is implemented by the following two
instructions:


MULI (MULtiply Integer) The top two
elements of the expression stack
are replaced by their product.




MULR (MULtiply Real) The top two
quantities on the expression stack
which are both of type real, are
replaced by their product.


Real division is implemented with the following
instruction:


DIVR (DIVide Real) The next to top
element of the expression stack is
divided by the top element. The
result is pushed onto the stack
after the two operands are
popped off.


Integer division (division with truncation), is
implemented by the following instruction:


DIVI (DIVide Integer) The next to top
element of the expression stack is
divided by the top element. Before
the result is pushed onto the
stack, the two operands are popped
off.


The multiplying operator modulus has the following
instruction:


MODD (MODulus) The top two integer
elements of the expression stack
are replaced by the remainder after
dividing the next to top element by
the top element.


Finally, set intersection and logical "and", both
represented by the same PASCAL operator, have the
following instructions:


INTR (INTeRsection) The top two elements
of the expression stack are
replaced by their set theoretic
intersection.


LAND (Logical AND) The top two elements
of the expression stack are
replaced by their logical and.
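
For the two integer division instructions the operand order and the rounding both matter. The sketch below is an illustration under assumptions: the quotient is truncated toward zero, as "division with truncation" suggests, and the remainder is taken to be the dividend minus divisor times that quotient, which the text does not state for negative operands. The helper `trunc_div` is invented here.

```python
stack = []  # the expression stack

def trunc_div(a, b):
    """Integer quotient of a by b, truncated toward zero."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q

def DIVI():
    """Divide the next to top element by the top element."""
    top = stack.pop()
    next_to_top = stack.pop()
    stack.append(trunc_div(next_to_top, top))

def MODD():
    """Remainder after dividing the next to top element by the top element."""
    top = stack.pop()
    next_to_top = stack.pop()
    stack.append(next_to_top - top * trunc_div(next_to_top, top))

stack.extend([7, 2])
DIVI()                 # 7 div 2
stack.extend([-7, 2])
DIVI()                 # -7 div 2, truncated toward zero
stack.extend([7, 2])
MODD()                 # 7 mod 2
```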
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Factors are the basic units in PASCAL expressions. 
These are variable identifiers, expressions enclosed in 
parentheses, array elements, record subfields, constants, 
function references and set expressions. The only operator 
applicable at this level is the logical "not". This operator 
is implemented by the instruction: 


LNOT (Logical NOT) The boolean element
at the top of the expression stack
is replaced by its logical
complement.
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In the following sections, we present the sequence of 
pseudo-machine instructions generated for each statement of 
PASCAL. 


2.3.5.1 Assignment Statements 


With the instructions that have been introduced thus 
far, it is possible to emit code sequences for assignment 
statements. This can be best shown by a number of examples. 
The following fragment of a PASCAL program illustrates the 
pseudo-machine code generated for assignment statements. The 
address couples and subfield numbers appear as comments next 
to the variables which they represent. The code sequences 
appear as comments next to the statements. 


PiU emNa col kee ral Layali... 10, lanuta } of integer: 
Index = 1..2; 


Var neearlayve( ls .5;7 1. .1>)° OF boolean; f (17.0) } 
Beware y Wels. on) OLearrayuenl: 7/3] of; boolean: 
fel Na} 


C: boolean; Gra 
D: record fo (17 3) 
E: integer; {BOR 
Pre aurayveleel.«.0) | oLernategers flat 
case G: Index of { 2-3 
1: (H: integer) ; tes} 
Dae y ered.) { 4 3 
end; 
X: real; {1 (1, 4}>3 
K: integer; {20{ 1), >) ant 
Wemarravelil 102) of t C1 ,o) 8} 
record 


M: integer; {20%} 
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Wid 


£00 


Ne eceal fost} 

end; 
T=erecord. GTAP’: 

Peainteger: {ie} 

Uemabnayy |) 1..110) |] ofsrea i fost 

end; 
Ree peat. x * tO TG Ey 
a eOLLAVG ile. ey Of bDOOLean;: PSC a eet 
begin 


Ghia, a: { PUSA (1,0) 


C:=B[2][3]; { PUSA 


K:=D.F[ 1]; { PUSA (1,3) 


K2=LEayeNs { PUSA (1,6) 


= be 


Mise Oe O(-05 |/ X05 { PUSrepatstipery 
1 


K:=R4(3,2): { PUSV (1,8) 


LES J.u:=1 { PUSA (1,6) 


2.3.5.2 Procedure Invocation 


Procedure invocations have been discussed previously in
section 2.3.3.


2.3.5.3 If Then Else Statements 


This statement has two forms in PASCAL. It can be
formed with and without the else clause. In the first
instance, the statement form is:


IF <expression>
THEN <statement1>
ELSE <statement2>


The code sequence emitted for this would be: 


          . evaluate <expression>
          FJMP LAB1
          . <statement1>
          UJMP LAB2
LAB1      . <statement2>
LAB2      . next <statement>


where LAB1 and LAB2 are PASCAL pseudo-machine addresses.
For this statement, two new instructions have been
introduced:


FJMP n (False JuMP) If the top of the
expression stack contains the
boolean false, then jump to
location n; otherwise continue
execution with the next statement.
In any case, pop the expression
stack once.


UJMP n (Unconditional JuMP) Control is
unconditionally transferred to
location n.


In the second form of the statement, the ELSE clause is
omitted: 


IF <expression> 
THEN <statement> 


This results in the code sequence: 


          . evaluate <expression>
          FJMP LAB1
          . <statement>
LAB1
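
Both forms of the IF statement can be produced by one small emitter. The following is a sketch of how a compiler might do this, not the thesis implementation: the instruction strings, the "LABn:" notation for a label definition and the helper names are assumptions, and the condition and branches are passed in as already-compiled instruction lists.

```python
import itertools

_labels = itertools.count(1)

def new_label():
    """Return a fresh symbolic label (LAB1, LAB2, ...)."""
    return f"LAB{next(_labels)}"

def emit_if(condition, then_part, else_part=None):
    """Return the pseudo-machine lines for an IF statement."""
    code = list(condition)            # evaluate <expression>
    lab1 = new_label()
    code.append(f"FJMP {lab1}")       # skip the THEN part when false
    code += then_part
    if else_part is None:
        code.append(f"{lab1}:")
    else:
        lab2 = new_label()
        code.append(f"UJMP {lab2}")   # skip the ELSE part after THEN
        code.append(f"{lab1}:")
        code += else_part
        code.append(f"{lab2}:")
    return code

code = emit_if(["PUSV (1,2)"],
               ["PCVI 1", "STOW (1,5)"],
               ["PCVI 0", "STOW (1,5)"])
short = emit_if(["PUSV (1,2)"], ["PCVI 1", "STOW (1,5)"])
```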


The case statement is used to select a statement for
execution from a number of choices based on the value of a
so-called "tag" expression. It has the format: 


CASE <expression> of 
<constant labels>: <statement1>; 
<constant labels>: <statement2>; 


<constant labels>: <statementn> 
END 


The code sequence emitted for this statement might be: 


          . evaluate <expression>
          XJMP LAB1-m
LAB2      . <statement1>
          UJMP LABn+2
LAB3      . <statement2>
          UJMP LABn+2
LAB4      .
          . remaining <statement>'s
          UJMP LABn+2
LAB1      UJMP LAB3
          UJMP LAB17
          .
          UJMP LAB22
LABn+2


The value of m in the above code sequence is the lower 
limit of the tag expression type. A new instruction has
been introduced for this statement:


XJMP n (indexed JuMP) This instruction
takes the value at the top of the
expression stack, adds this to the
address specified by the location
n, and then jumps there. The
expression stack is popped once.
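
The two-step dispatch (XJMP into a table of UJMP instructions) can be traced with a few lines of arithmetic. This is a sketch under assumptions: each table entry is taken to occupy one address, and the addresses used in the example are invented.

```python
def case_dispatch(tag_value, m, table_start, jump_table):
    """Return the address reached for tag_value.

    table_start is the address of LAB1, jump_table[k] is the operand of the
    k-th UJMP stored from LAB1 onwards, and m is the lower limit of the tag type.
    """
    n = table_start - m                          # operand of XJMP, i.e. LAB1-m
    landing = n + tag_value                      # XJMP adds the popped tag value to n
    return jump_table[landing - table_start]     # operand of the UJMP found there

# Hypothetical layout: tag type 3..5, jump table at address 50,
# the three statements at addresses 10, 20 and 30.
targets = [case_dispatch(v, 3, 50, [10, 20, 30]) for v in (3, 4, 5)]
```

Subtracting m in the operand lets the tag value be used directly, without first normalising it to start at zero.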


The while statement requires no new pseudo-machine
instructions for compilation. It has the format:


WHILE <expression> DO 
<statement> 


The code sequence which results from the compilation of this
statement is:


LAB1      . evaluate <expression>
          FJMP LAB2
          . <statement>
          UJMP LAB1
LAB2


where LAB1 and LAB2 are pseudo-machine addresses.


2.3.5.6 Repeat Statement 


The repeat statement also requires no new pseudo-machine
instruction. It has the format:


REPEAT <statement list>
UNTIL <expression> 


This statement produces the code sequence: 
LAB       . <statement list>
          . evaluate <expression>
          FJMP LAB


where LAB is a pseudo-machine address. 
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The for statement can be of two forms: 


FOR <control variable>:=<expression1> TO <expression2> 
DO <statement> 


or 


FOR <control variable>:=<expression1> DOWNTO <expression2>
DO <statement> 


The following sequence of pseudo-machine code is emitted for
the first form of this statement:


          . evaluate <expression1>
          . evaluate <expression2>
          STOW <dummy variable>
          STOW <control variable>
          PUSV <dummy variable>
          PUSV <control variable>
          GEQI
          FJMP LAB2
LAB1      . <statement>
          PUSV <dummy variable>
          PUSV <control variable>
          NTEI
          FJMP LAB2
          PUSV <control variable>
          INCR
          STOW <control variable>
          UJMP LAB1
LAB2


In this code sequence, a new instruction has been 
introduced: 


INCR (INCRement) The top element of the 
expression stack is incremented by 
the value 1. 


We also note the use of a dummy
variable. The compiler declares this variable as a local
variable in order to save the value of <expression2>. The
value of <expression2> cannot be stored on the expression
stack, because of the possibility that the loop may be exited
using a GOTO statement. Hence, we require the dummy
variable.
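
The loop protocol can be checked on a toy interpreter. The sketch below is an illustration under stated assumptions, not the thesis implementation: the triple format, the `for_to` and `run` helpers, and the BODY and NOP pseudo-operations (standing in for <statement> and for the code at LAB2) are invented here, and the initial test is taken to be GEQI, as the description of the DOWNTO form indicates.

```python
def for_to(first, last):
    """Code for FOR i := first TO last DO <statement>, as (label, op, arg) triples."""
    return [
        (None, "PCVI", first),      # evaluate <expression1>
        (None, "PCVI", last),       # evaluate <expression2>
        (None, "STOW", "dummy"),
        (None, "STOW", "i"),
        (None, "PUSV", "dummy"),
        (None, "PUSV", "i"),
        (None, "GEQI", None),       # dummy >= i: is there anything to do?
        (None, "FJMP", "LAB2"),
        ("LAB1", "BODY", None),     # hypothetical stand-in for <statement>
        (None, "PUSV", "dummy"),
        (None, "PUSV", "i"),
        (None, "NTEI", None),       # dummy <> i: are iterations left?
        (None, "FJMP", "LAB2"),
        (None, "PUSV", "i"),
        (None, "INCR", None),
        (None, "STOW", "i"),
        (None, "UJMP", "LAB1"),
        ("LAB2", "NOP", None),      # hypothetical placeholder for the next statement
    ]

def run(code):
    """Execute the triples; return the control variable's value at each BODY."""
    labels = {label: pos for pos, (label, _, _) in enumerate(code) if label}
    variables, stack, trace, pc = {}, [], [], 0
    while pc < len(code):
        _, op, arg = code[pc]
        pc += 1
        if op == "PCVI":
            stack.append(arg)
        elif op == "PUSV":
            stack.append(variables[arg])
        elif op == "STOW":
            variables[arg] = stack.pop()
        elif op == "INCR":
            stack.append(stack.pop() + 1)
        elif op in ("GEQI", "NTEI"):
            top = stack.pop()
            next_to_top = stack.pop()
            stack.append(next_to_top >= top if op == "GEQI" else next_to_top != top)
        elif op == "FJMP":
            if not stack.pop():
                pc = labels[arg]
        elif op == "UJMP":
            pc = labels[arg]
        elif op == "BODY":
            trace.append(variables["i"])
    return trace

print(run(for_to(2, 4)))
print(run(for_to(5, 4)))
```

Because the limit lives in a variable rather than on the expression stack, the stack is empty whenever the body runs, which is what makes an exit by GOTO harmless.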




For the second form of the for statement, in which
DOWNTO replaces TO, the preceding code sequence is produced
with two changes: the GEQI instruction is replaced by a LEQI
instruction, and the INCR instruction is replaced by a DECR
instruction. This new instruction, DECR, operates as
follows:


DECR (DECRement) This decrements the top
element of the expression stack by
1.


2.3.5.8 With Statement 


The with statement opens the scope of a record
variable, and allows the access of subfields without the 
need for specifying the record name. It has the format: 


WITH <record variable list> 
DO <statement> 


Every time a subfield of a record in the <record 
variable list> is accessed, within the scope of the WITH 
statement, the base address of the record is first pushed on 
the expression stack, before evaluating the effective 
address. If the base address of the record requires
computation of some sort (for example, pointer
dereferencing), a new dummy variable, declared by the
compiler, is used to save this base address. The reason for
not using the display for storing the base address of the 
record(s) in the <record variable list>, is the same as that 
for the FOR statement -- the possibility of a GOTO statement 
occurring within the WITH statement, branching to a 
statement outside of the scope of the WITH statement. 


2.3.5.9 Goto Statement


The goto statement allows the transfer of the flow of 
control to another point in the program. This transfer can 
be made to one of:

(a) a location within the current scope level 


(b) a location in an outer scope. 
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The former transfer can be implemented using the 
instruction:


UJMP location


The latter requires the emission of the instruction 
sequence: 


CLEN level
UJMP location


A new instruction has been introduced to handle the
second case of this statement:


CLEN level (CLEaNup) This instruction restores
the displays to the lexic level
specified as the operand.
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2.3.6.1 Standard Functions 


A number of standard functions are included in the
language PASCAL. All of these have corresponding machine
instructions in the PASCAL pseudo-machine. The following
machine instructions operate on the top element of the
expression stack, and replace it by the result of the 
application of the function: 


ABSI (ABSolute value of Integer) This
instruction replaces the integer at 
the top of the expression stack by 
its absolute value. 


ABSR (ABSolute value of Real) The real 
number at the top of the expression 
stack is replaced by its absolute 
value. 


SQRI (SQuaRe Integer) The integer at the
top of the expression stack is 
replaced by its square. 


SQRR (SQuaRe Real) The real number at 
the top of the expression stack is 
replaced by its square. 


ODDD (ODD) If the integer at the top of
the expression stack is odd, it is 
replaced by the boolean true; 
otherwise, it is replaced by the 
boolean false. 


SUCC (SUCCessor) The top element of the
expression stack is replaced by its 
successor. If no successor exists, 
the result is undefined. 


PRED (PREDecessor) The top element of
the expression stack is replaced by
its predecessor. If no predecessor
exists, the result is undefined.


ORDD (ORDinal) The character at the top 
of the expression stack is replaced 
by the ordinal number representing 
it in the character set. 
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CHRR (CHaRacter) The ordinal number at 
the top of the expression stack is 
replaced by the character which it 
represents in the character set. 


EOFF (End Of File) The result is the
boolean true if the specified file
is in end of file status;
otherwise, the result is false.


TRNC (TRuNCation) The top real element 
of the expression stack is replaced 
by an integer which is the integral 
part of the real number.


2.3.6.2 Standard Arithmetic Functions


Also included in PASCAL are a group of standard 
arithmetic functions. All are implemented by declared
external procedures, and all operate in a manner similar to
user defined procedures. This allows the user to define his
own versions of these routines. All expect one parameter of
type real and all return a real result. They are invoked by 
the CSPR instruction described in section 2.3.3. 


SQRT (The square root function) The
returned result is the square root
of the parameter.


EXP (The exponential function) The
returned result is the number e
raised to the power of the value of
the parameter.


LOG (The natural logarithm function) 
The result is the natural logarithm 
of the parameter. 


SIN (The sine function) The result is
the sine of the parameter. 


COS (The cosine function) The result is
the cosine of the parameter. 


ARCTAN (The arctangent function) The
returned result is the arctangent
of the parameter.


2.3.6.3 Standard Procedures


A number of standard, built-in procedures are also
included in PASCAL. These are invoked by the machine
instruction:
instruction: 


CSPR name (Call Standard PRocedure) This
instruction is described in section
2.3.3.


Besides the standard arithmetic functions described in the 
previous section, this instruction may also be used to 
invoke the following procedures: 


PUT (Output to file procedure) This
procedure appends the value of the
buffer variable, whose address is
the top element of the expression
stack, to the file associated with
it.


GET (Input from file procedure) This 
procedure advances the current file 
position to the next component and 
assigns the value of this component 
to the buffer variable, whose 
address is the top element of the 
expression stack. 


OPENR (Reset file position procedure) 
This procedure resets the current 
file position of the file, whose 
address is the top element on the 
expression stack, to the beginning. 


OPENW (Delete file) This procedure 
discards the current value of the 
file variable, whose address is the 
top element of the expression 
stack. 


NEW (The new procedure) This procedure
receives one parameter, a pointer
to a type which is to be allocated,
or two parameters, a pointer to a
record type with variants, which is
to be allocated plus a parameter
which specifies the initial value
of the first tag field occurring in
the record. The code sequence
emitted for a reference to new with
one parameter is:


PUSA <pointer variable>
LENG n


For this procedure, we have
introduced a new instruction:


LENG n (LENGth of type n)
This pushes the
length of the type
whose range number
is n onto the
expression stack.
This type is the
object type of the
pointer variable.
The length of the
type is determined
in the code
generation step.


For the two parameter call to new, 
the above sequence of instructions 
is extended to initialize the tag 
field: 


PUSV <pointer variable> 
SETA n 

PCVI <constant> 

STOR 


PACK (The pack procedure) This procedure
copies an array, whose address is
the third from top element on the
expression stack, to a packed
array, whose address is the top
element of the expression stack,
for a length specified by the next
to top element of this stack.


UNPACK (The unpack procedure) This
procedure copies a packed array to
an unpacked array. The source
array address is the third from top
element on the expression stack,
the target array address is the
next to top element and the length
is the top element.


READ (Input and advance procedure) This
procedure assigns the value of the
buffer variable, associated with
the standard input file, to the
character parameter, whose address
is the top element of the
expression stack. The file is then
advanced one component.


WRITE (Output and advance procedure) The
value of the character parameter, 
which is the top element of the 
expression stack, is assigned to 
the buffer variable associated with 
the standard output file. The 
value of the buffer variable is 
then appended as the next component 
of the file. 
Finally, there are three instructions in the PASCAL
pseudo-machine which perform internal operations. These are:


STSR n (SeT Statement Register) This
instruction sets the statement
register to the value n.


CKRG n (Check RanGe) This instruction does
a range check on the top element of 
the expression stack, where n is 
the index into the range table. 
This instruction is emitted to 
perform range checking for subrange 
types and subscripts of arrays. 


STOP (STOP) This instruction causes the 
immediate halt of execution. It is 
the last instruction to be emitted 
for any program. 


In the preceding sections, we have presented a pseudo- 
machine for code generation and its application in compiling 
the various constructs of PASCAL. With the presentation of 
their semantics, we note that the pseudo-machine 
instructions may also be executed by an interpreter, as well 
as being used for code generation. A summary of these
instructions appears in Appendix B.
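
The remark that the instructions can also be interpreted may be illustrated with a very small interpreter loop. It is a sketch only, written in Python, and covers five instructions. Several points are assumptions made for the example: ADDI is taken to pop two integers and push their sum, each range table entry is taken to be a (low, high) pair, and the names `run` and `RangeCheckError` are hypothetical.

```python
class RangeCheckError(Exception):
    """Raised when CKRG finds the top of stack outside its range."""


def run(code, range_table):
    """Interpret a list of (mnemonic, operand) pairs.

    Returns the final expression stack and the statement register.
    """
    stack = []
    statement_register = 0
    for mnemonic, operand in code:
        if mnemonic == "PCVI":        # push constant value, integer
            stack.append(operand)
        elif mnemonic == "ADDI":      # assumed: pop two, push the sum
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif mnemonic == "STSR":      # set statement register to n
            statement_register = operand
        elif mnemonic == "CKRG":      # range check on top of stack
            low, high = range_table[operand]
            if not low <= stack[-1] <= high:
                raise RangeCheckError(
                    f"statement {statement_register}: {stack[-1]} "
                    f"not in {low}..{high}"
                )
        elif mnemonic == "STOP":      # immediate halt
            break
        else:
            raise ValueError(f"unknown instruction {mnemonic}")
    return stack, statement_register
```

Because STSR records the current statement number, a failed range check can report the source statement at which it occurred.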
CHAPTER 3


COMPILER STRUCTURE


In this chapter we discuss the first phase, the 
compiler, of a language translator system for PASCAL using a 
pseudo-machine for code generation. The following sections 
describe the components of the compiler. 


3.1 General Overview 




The compiler for the PASCAL language was written in the
SUE System Language [Clark 1971]. A subset of the SUE
Language, which has direct counterparts in the PASCAL
Language, was used (see Appendix E). The SUE System
Language was chosen because of its similarity to PASCAL
(making the task of rewriting the compiler much simpler),
and the availability of a compiler for it. The purpose of
this was to allow the compiler to be rewritten, in the
future, in PASCAL and have the compiler compile itself.
This would result in a machine independent version of the
PASCAL compiler.


The compiler uses the top-down, recursive descent 
technique of parsing. Each non-terminal of the language has 
a recursive procedure associated with it. The procedure 
associated with a non-terminal U is invoked by the syntax 
analyzer when it is searching for the U. Error recovery is 
accomplished by simply skipping the source text until either 
the start symbol or final symbol of the terminal associated 
with the non-terminal is encountered. The rules of syntax 
used in this parsing technique for PASCAL are summarized in 
the syntax charts in Appendix A. 
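
The error recovery rule above amounts to a single skipping loop. The following Python sketch assumes that tokens are plain strings and that each non-terminal supplies its own sets of start and final symbols. The function name `skip_until` is hypothetical.

```python
def skip_until(tokens, position, start_symbols, final_symbols):
    """Skip source tokens until a start or final symbol is found.

    Returns the index of the first token belonging to either set, or
    len(tokens) if the end of the source text is reached first.
    """
    stop = set(start_symbols) | set(final_symbols)
    while position < len(tokens) and tokens[position] not in stop:
        position += 1
    return position
```

After the call, the procedure for the non-terminal can either restart at a start symbol or return to its caller at a final symbol.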


As well as handling the parsing of the PASCAL source 
program, this section of the language translator system is 
also responsible for scanning, symbol table manipulation, 
and type checking. The output from this phase is the PASCAL 
pseudo-machine code, to which the original program has been 
transformed, plus a range table. This range table contains 
the information necessary for the code generator to 
determine the type and amount of storage required by each of
the variables used in the program.
As was stated before, in order to generate a PASCAL
compiler in a reasonable amount of time, as much of the
Ammann-Jensen compiler-interpreter system was used as
possible. The overall structure of our PASCAL compiler is
essentially the same as the Ammann-Jensen compiler. The
scanner, method of parsing and type checking routines were
left intact. The symbol table routines were altered to
provide the information necessary for the PASCAL pseudo-
machine described in Chapter 2. This was required because we
changed the address computation scheme.


Many of the code emission routines suited our purposes,
and their overall structure was adopted. However, changes
were made to enable us to generate code for our more general
method of addressing variables. This has allowed the 
implementation of this compiler to progress more quickly 
than would have been possible had we decided to develop our 
own routines. 
3.2.1 The Symbol Table


The symbol table is organized as an unbalanced binary 
tree at each lexic level. There is a display stack, 
consisting of pointers, which acts as the root of each tree, 
the n-th entry being the root of the tree for the procedure 
at lexic level n. When a new scope level is entered, 
triggered by the appearance of a procedure/function 
definition or a WITH statement, a new entry is pushed onto 
the display. A binary tree is then created, with the root 
at the display, which contains the variables for that scope. 
When the scope is exited, the display is popped. Figure 3-1a
illustrates a symbol table at lexic level 2, organized in 
this fashion. 


Each binary tree depends on the order of the variable 
names which are entered. After the first variable is entered 
for a scope level at the root, succeeding entries are made 
at leaf nodes. If the name to be entered is lexically 
greater than a name at a node, the right link is chosen; 
otherwise the left link is chosen. The result of entering 
the variables whose names are J, B, Z, F, I, L, and C, in
this order, is illustrated in Figure 3-1b.
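
The insertion rule can be sketched in a few lines of Python. This is an illustration of the rule as stated in the text, not the compiler's own routine. The names `Node`, `enter` and `build` are hypothetical.

```python
class Node:
    """One symbol table entry, reduced to its name and two links."""

    def __init__(self, name):
        self.name = name
        self.left = None
        self.right = None


def enter(root, name):
    """Enter a name at a leaf and return the root of the tree."""
    if root is None:
        return Node(name)
    node = root
    while True:
        if name > node.name:          # lexically greater: right link
            if node.right is None:
                node.right = Node(name)
                return root
            node = node.right
        else:                         # otherwise: left link
            if node.left is None:
                node.left = Node(name)
                return root
            node = node.left


def build(names):
    """Enter the names in order, starting from an empty scope level."""
    root = None
    for name in names:
        root = enter(root, name)
    return root
```

For the order J, B, Z, F, I, L, C the root is J, with B and Z as its left and right sons. F hangs to the right of B, C and I to the left and right of F, and L to the left of Z. A different order of declaration gives a different shape, which is why the tree is unbalanced.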


Each node of the binary tree is an entry for one
variable, type, constant or procedure/function name and
contains the following information:


o We record the character representation of the name
  of the variable.


o There is a left and a right link, to which are
  attached later entries to the symbol table.


o A link points into the type stack (described in
  the next section), which provides the type
  information of this variable.


o A link is used to connect the subfields of a
  record, the names of the constants of a programmer
  type, or the parameter names to the
  procedure/function name.


o For manifest constants, the value of the constant
  is also recorded.


o For variables, the lexic level and order number
  are recorded, as well as information indicating
  whether this variable is a call by value
  parameter, a locally declared variable, or a call
  by reference parameter.


o For subfields of a record, the field number is
  recorded.


o For procedures and functions, we make an entry
  indicating whether it is a standard
  procedure/function or a declared one. If the
  procedure/function is standard, the internal key
  number is noted. For a declared procedure/function
  the lexic level, address and type of procedure
  (parametric or actual) are recorded. If the
  procedure is not parametric, then two boolean
  values indicate whether the procedure/function is
  external (the standard arithmetic functions) and
  whether it is declared forward. We allow procedure
  and function names and parameters to be defined
  before their bodies by the use of the forward
  construct. This permits recursive procedure
  invocations by brother procedures.


Figure 3-1a Symbol Table Using
Unbalanced Binary
Trees


Figure 3-1b Tree After Entries
Have Been Made For
3.2.2 The Type Stack


The type stack is used to hold the type information of
the variables in the symbol table. Each of its entries
contains the following information:


o A boolean is used to prevent cycles when printing
  out the contents of the type table. These tables
  are printed out at the end of the compilation of a
  program.


o A pointer is used to index the range table
  (described in section 2.2).


o The type number of this type is recorded.


o For scalars, we record the type (standard or
  declared). If the scalar is declared, a link
  points back into the symbol table to its name.


o For subrange types, we record the containing type,
  and also the minimum and maximum values.


o For pointers, a link to the object type is
  recorded.


o For set types, a pointer to the set base type is
  recorded.


o For arrays, a pointer to the array element type
  and a pointer to the index type, as well as the
  number of elements are recorded.


o For records, a link points to the name of the
  first field in the symbol table. Another link
  points to the first variant (if any) which is
  recorded in the type stack.


o Files have a pointer to their component type
  recorded.


o For tagfields, we record a pointer to the name of
  the tagfield in the symbol table. Another pointer
  links to the first variant, in the type table, for
  which this type is the tagfield.


o Variants are noted by recording the value of their
  label. A pointer provides a link to the next
  variant, and another pointer links to any
  subvariant which may occur within this variant.
3.2.3 The Constant Stack


Integer, one character and boolean constants are
recorded in the pseudo-machine instructions. The remaining
constants are recorded in a constant stack. The entries in
this stack hold the following information:


o We record the values of real constants and set
  constants.


o For string constants whose length is greater than
  one, we record the string and its length.


3.2.4 Exit Label List and Label List 


Exit labels are recorded by noting the label value and
a pointer to the next exit label. The exit label construct 
of the PASCAL language is not supported in this version of 
the PASCAL compiler. However, the syntax of the label 
declaration is checked. 


The label list holds information pertaining to the 
local labels within a block. This information consists of 
the integer value of the label, the pseudo-machine address 
which it represents, a pointer to the next label entry and a
boolean which indicates whether this label has been
resolved. If the label has not been resolved, then the label
address points to the last instruction which referred to
this label. This instruction heads a list of instructions,
chained together through the operand1 field, which referred
to this label. 
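
The chaining of unresolved references through the operand1 field can be sketched as follows. The Python below is only an illustration under stated assumptions: instructions are two-element lists `[op-code, operand1]`, the end of a chain is marked by the hypothetical value `NIL`, and the names `Label`, `refer` and `define` are invented for the example. The mnemonic FJMP is used merely as a sample branch.

```python
NIL = -1  # hypothetical end-of-chain marker


class Label:
    """A label list entry: value, address and resolved flag."""

    def __init__(self, value):
        self.value = value
        self.address = NIL      # head of the reference chain until resolved
        self.resolved = False


def refer(code, label, opcode):
    """Emit a branch to `label` and return the instruction's index."""
    index = len(code)
    # Resolved: operand1 is the real address. Unresolved: operand1
    # links to the previous instruction that referred to the label.
    code.append([opcode, label.address])
    if not label.resolved:
        label.address = index   # this instruction now heads the chain
    return index


def define(code, label):
    """Resolve `label` to the next address and correct the chain."""
    target = len(code)
    reference = NIL if label.resolved else label.address
    while reference != NIL:
        following = code[reference][1]   # next link before overwriting
        code[reference][1] = target
        reference = following
    label.address = target
    label.resolved = True
```

No separate list of forward references is needed, since the operand1 fields of the waiting instructions hold the chain until the label is met.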


3.2.5 The Declaration and Pseudo-Machine Tables 




For each procedure or function, we supply a declaration 
table and a pseudo-machine code table. These tables contain
the declarations of the variables of the procedure and the 
code into which the PASCAL source program has been compiled, 
respectively. The declaration table consists of a series of 
range numbers of the variables in this procedure, along with 
a field which indicates whether this variable has been 
declared locally, is a call by value parameter, or a call by
reference parameter. The code table consists of a series of
instructions whose format is


label   op-code   operand1,operand2


The label field may or may not be present, depending upon
whether a branch is made to this instruction. Op-code is
the numeric code of one of the PASCAL pseudo-machine
instructions, and operand1 and operand2 specify the required
operands of the instruction.
3.2.6 The Organization of the Compiler


The compiler consists of a sequence of recursive
modules which both parse and emit code for a PASCAL source
program. Each module, when called, is responsible for
locating the construct which it compiles and for recovering,
in case of error, to the point where the calling module may
continue correctly.


The compiler begins by initializing the tables to 
contain the standard variables, types, constants, and 
procedures and functions. Processing of the PASCAL source 
program now begins at the block level. 


Exit label declarations, if any are present, are the 
first items to be processed. The label value is checked to 
be an integer, and if valid, is entered into the exit list. 
Manifest constants are then processed. Their names are 
entered into the symbol table, and their values, if non- 
integer, are entered into the constant stack. Type 
definitions are processed and are entered into the symbol 
and type tables. It is at this point that entries are also 
made into the range table for these types. Next, the 
variable declarations are processed. Entries are made into 
the symbol table for all names, and entries are also made 
into the type and range tables if any new types are
implicitly defined. PASCAL pseudo-machine instructions are
emitted at this point for the allocation of the local
variables. Finally, procedure and function definitions are
handled. A new lexic level is entered for each, and the
parameter definitions are processed. The rest of the
procedure/function definition is handled by a recursive call
to the block processing routine. 


The body of the program is now processed. The body 
consists of a sequence of statements. All of these 
statements (except the assignment and procedure reference 
statements) begin with a reserved word. There is one 
routine for each type of statement. If the statement is 
preceded by a label, then this is inserted into the label 
list and any unresolved references to it are corrected. 


Procedure/function references are processed by a
separate set of routines. A check is first made as to the
type of the procedure/function reference. If the type is
standard, then the linkage code for a standard
procedure/function is generated. Otherwise, if the
reference is to a declared procedure, code is generated
which invokes the user defined procedure.


The expressions contained in the statements are 
processed by a series of recursive routines which handle 
expressions; simple expressions, which are the elements of 
expressions; terms, which are the elements of simple
expressions; factors, which are the elements of terms; and
variables, which are the elements of factors. All of these
routines emit PASCAL pseudo-machine instructions for
evaluating the construct which they compile. Type checking
is accomplished by maintaining record variables which
contain the type of the last element in the expression,
simple expression, term and factor to be processed.

The flow of control between these sections of the
compiler is shown in Figure 3-2.
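
The chain of recursive routines for expressions can be sketched for a very small integer subset. The Python below is a hedged illustration, not the compiler's code. It handles only `+`, `div`, parentheses, unsigned integers and simple variables. The operand of PUSV is shown as the variable name rather than an address couple, and the function name `compile_expression` is hypothetical.

```python
RESERVED = {"div"}


def compile_expression(tokens):
    """Parse a token list and return the emitted (mnemonic, operand) pairs."""
    code = []
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        position += 1

    def factor():
        token = peek()
        if token == "(":
            advance()
            simple_expression()
            if peek() != ")":
                raise SyntaxError("')' expected")
            advance()
        elif token is not None and token.isdigit():
            code.append(("PCVI", int(token)))    # push constant integer
            advance()
        elif (token is not None and token.isidentifier()
              and token not in RESERVED):
            code.append(("PUSV", token))         # push value of variable
            advance()
        else:
            raise SyntaxError(f"unexpected symbol {token!r}")

    def term():
        factor()
        while peek() == "div":
            advance()
            factor()
            code.append(("DIVI", None))          # emitted after both operands

    def simple_expression():
        term()
        while peek() == "+":
            advance()
            term()
            code.append(("ADDI", None))

    simple_expression()
    if position != len(tokens):
        raise SyntaxError(f"unexpected symbol {peek()!r}")
    return code
```

Each routine emits the operator only after both operands have been compiled, so the instructions come out in the postfix order which an expression stack requires.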
Figure 3-2 Flow of Control
In Compiler
As the preceding sections indicate, the described
PASCAL compiler produces code and a range table from a
PASCAL program, for our pseudo-machine described in Chapter
2. This code and table may now be input into the second
phase of a translator system using a pseudo-machine for code
generation, called the code generator. This step can then
produce, from this information, machine code for a real
machine.


The compiler has taken approximately five months to 
write and consists of approximately 8000 lines of SUE code. 
It is felt that a code generation step, written for the IBM
S/370, would take about three to four months to complete by
a person who is familiar with the described pseudo-machine
and the IBM S/370.


Now that the task of writing a compiler which
converts PASCAL programs into the pseudo-machine code of the 
preceding chapter is completed, it need never be redone. 
This phase is machine independent. The only step which 
need be rewritten is the code generator, since this, of 
course, is machine dependent. We present an outline of the 
design of a code generator for the IBM S/370 in Appendix C. 
CHAPTER 4


CONCLUSIONS


A pseudo-machine for code generation has been 
developed. The code emitted by the compiler described in 
Chapter 3 is presently being verified. 


In the following sections, we present the results of 
this investigation of using a pseudo-machine for code 
generation as an approach to translator writing. 


4.1 Major Features of the PASCAL Pseudo-Machine for Code Generation


This investigation has provided the features necessary 
in a pseudo-machine that is to be used for code generation. 
For PASCAL, these include a range table describing the data 
types present in the program. This table is used by the code 
generator when processing variable declarations. 


Another key feature of this pseudo-machine is the fact
that no decisions regarding space allocation for variables
are made in the compile step, but are left to the code
generation step. This provides great flexibility in
allowing the pseudo-machine to be implemented on machines of
different storage units (for example, word based machines
and byte based machines). The method of addressing variables
relies on the address couple, where (l,n) refers to the n-th
variable declared at lexic level l. The mapping from order
number to actual machine displacement is the task of the
code generator.
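
The mapping from order number to displacement can be sketched for one lexic level. The Python below is an illustration under assumptions: the byte sizes follow the S/370 table of Appendix C, the word sizes are hypothetical values for an imagined word based machine, and variables are aligned on a boundary equal to their own size. The names `BYTE_MACHINE`, `WORD_MACHINE` and `map_order_numbers` are invented for the example.

```python
BYTE_MACHINE = {"integer": 4, "char": 1, "real": 8}   # sizes in bytes
WORD_MACHINE = {"integer": 1, "char": 1, "real": 2}   # sizes in words (hypothetical)


def map_order_numbers(declared_types, sizes, align=True):
    """Return the displacement of each variable, indexed by order number.

    `declared_types` lists the types of the variables of one lexic
    level in order of declaration.
    """
    displacements = []
    offset = 0
    for type_name in declared_types:
        size = sizes[type_name]
        if align and offset % size:
            offset += size - offset % size   # round up to the boundary
        displacements.append(offset)
        offset += size
    return displacements
```

The same declaration list gives different displacements on the two machines, yet the address couples emitted by the compiler are identical in both cases.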


Finally, another important feature of the pseudo-machine
lies in the separation of the run, local variable,
and expression stacks. This allows the code generator to
allocate these stacks in the best location on the real
machine for which it is generating code.
4.2 Effect of Using a Pseudo-Machine for Code Generation


4.2.1 On Machine Independence


This approach of using a pseudo-machine for code 
generation does indeed provide a machine independent means 
of writing a compiler. Because of its design (for example, 
no storage allocation decisions for variables, and separate 
stack locations), the resulting pseudo-machine is very 
flexible and adaptable to many types of hardware. Machine 
dependent operations, such as the allocation of storage for 
variables, and location of the stacks are performed at code 
generation time, when the characteristics of the host 
machine are known. 


As was stated in the previous chapter, the compiler was 
implemented using a subset of SUE which has counterparts in 
PASCAL. It is now an extremely simple task to recode the 
compiler in PASCAL. And once this has been done, the 
compiler coded in PASCAL could be compiled by the SUE 
version, producing pseudo-machine code and a range table
which are both machine independent and portable. The only
requirement now, for obtaining a running PASCAL compiler on
any machine, is to write a code generator which converts the
pseudo-machine code into the machine language of that
machine. 


In general, if any compiler, using a pseudo-machine for
code generation, is capable of compiling itself, then one
has a portable piece of software.


4.2.3 On Efficiency


Because the allocation of storage has been left to the
code generator, we now have a means of using the storage
units of any host machine (for example, a byte on an IBM
S/370 or a word on a PDP-10). This, then, allows a
compaction of storage, and hence a saving over the
conventional compiler-interpreter system.


The separation of the stacks is another important
efficiency consideration. This allows the code generator to 
place the stacks in optimal locations on the host machine. 
For example, on an IBM S/370, the expression stack could be 
implemented using the general purpose and floating point 
registers. 


Finally, real machine code executes more quickly than 
an interpreter can interpret intermediate code. Thus, if a 
program is to be used often, one need only save the output 
from the code generator. Peephole optimizers in the code 
generator could also improve the performance of its output. 


However, one may incur a higher overhead in translating 
programs using this approach. It is quite possible that 
these two steps could require more time and space on some 
machines, than would a one step translator that directly 
produces the machine code of the host machine. However, we
feel that the advantages of machine independence and 
portability gained by using this approach far outweigh this 
possible drawback. 


Using this technique, then, one could implement 
relatively quickly, language translators for a high level 
language for several machines. The translators can produce 
real machine code competitive with code produced by language 
translators written specifically for those machines. 


Thus, the overall effect of a pseudo-machine for code
generation on machine independence, compiler portability and
efficiency appears to be a highly desirable one.


This technique, then, of using a pseudo-machine for
code generation appears to be a workable approach to
translator writing. Enough flexibility is present in the
pseudo-machine code to allow implementation on many types of
hardware. Once the compiler phase of the system which
generates the pseudo-machine code is written, it need not be
redone. Only the code generator requires rewriting for each
host machine.


It is hoped that the results of this investigation will
aid future development of machine independent, portable and
efficient software, a vital issue in the world of
computers today.
APPENDIX A


Syntax Charts for PASCAL


The following pages contain the syntax charts for
PASCAL. These are taken from the Revised Report On PASCAL
[Wirth 1973].


[Syntax charts: identifier, unsigned integer, unsigned number, unsigned constant, constant, variable, factor, term, simple expression, expression, parameter list, simple type, type, field list, statement, block, program.]


APPENDIX B


PASCAL Pseudo-Machine Instructions 


ABSI                ABSolute value Integer            43
ABSR                ABSolute value Real               43
ADDI                ADD Integer                       30
ADDR                ADD Real                          30
CHRR                Standard function chr             44
CKRG n              Check RanGe                       48
CLEN n              CLEaNup                           42
CSPR name           Call Standard PRocedure           25
CUPR entry point    Call User PRocedure               25
DECL n, kind        DECLare                           25
DECR                DECRement                         41
DIFF                set DiFFerence                    31
DIVI                DIVide Integer                    32
DIVR                DIVide Real                       32
EOFF                Standard function eof             44
EQLB                EQuaL Boolean                     27
EQLC                EQuaL Character                   27
EQLI                EQuaL Integer                     27
EQLP                EQuaL Pointer                     27
EQLR                EQuaL Real                        27
EQLS                EQuaL Structured                  27
EQLT                EQuaL seT                         27
EVAL                EVALuate                          22
FJMP location       False JuMP                        37
FLOT                FLOaT                             30
ya ie FLoaT second 30 
GEQB                Greater than or EQual Boolean     28
                    Greater than or EQual Character   28
                    Greater than or EQual Integer     28
                    Greater than or EQual Real        28
                    Greater than or EQual Structured  29
                    GeNerate Singleton Set            31
                    GReater Than Boolean              28
                    GReater Than Character            28
                    GReater Than Integer              28
                    GReater Than Real                 28
                    GReater Than Structured           29
                    INCRement                         40
                    INDex array Address               20
                    INDex array Value                 21
                    element INclusion                 29
                    set INTeRSection                  32
                    Logical AND                       3
LENG n              LENGth                            46
                    Less than or EQual Boolean        28
                    Less than or EQual Character      28
                    Less than or EQual Integer        28
                    Less than or EQual Real           28
                    Less than or EQual Structured     29
                    Left set IN Right                 29
                    Logical inclusive OR              31
                    Logical NOT                       33
                    LeSs Than Boolean                 28
                    LeSs Than Character               28
                    LeSs Than Integer                 28
                    LeSs Than Real                    28
MKST                MarK STack                        24
                    MODulus                           32
                    MOVe Structured                   23
                    MULtiply Integer                  31
                    MULtiply Real                     32
                    NEGate Integer                    30
                    NEGate Real                       30
                    NoT Equal Boolean                 27
                    NoT Equal Character               27
                    NoT Equal Integer                 27
                    NoT Equal Pointer                 27
                    NoT Equal Real                    27
                    NoT Equal Structured              27
                    NoT Equal seT                     27
                    Standard function odd             43
                    Standard function ord             43
PARM                PARaMeter                         24
     constant       Push Constant ADdress             22
     constant       Push Constant Value Boolean       21
     constant       Push Constant Value Character     21
PCVI constant       Push Constant Value Integer       21
                    Push Constant Value Nil           21
     constant       Push Constant Value Real          21
     constant       Push Constant Value Set
                    PREDecessor                       43
PUSA                PUSh Address                      20
PUSV                PUSh Value                        20
RETN                RETurN                            25
                    Right set INcluded in Left        29
                    SubFIeld Address                  21
                    SubFIeld Value                    21
                    SQuaRe Integer                    43
                    SQuaRe Real                       43
STOP                STOP                              48
APPENDIX C


Code Generation for the IBM S/370


In this appendix, we present a brief outline of the use
of the output of the compiler, described in Chapter 3, in
the generation of real machine code for the IBM S/370. The
details are adapted in part from the implementation of the
SUE System Language [Clark 1971]. No optimizations are
attempted in this presentation.


The following table provides the mapping from the
types held in the range table to the storage units on the
IBM S/370.


Range Table Type               S/370 Storage Unit


Integer                        Fullword (32 bits)
Char                           Byte (8 bits)
Real                           Doubleword (64 bits)
Subrange   0 to 255            Byte (8 bits)
           -32768 to 32767     Halfword (16 bits)
           -2^31 to 2^31-1     Fullword (32 bits)
Sets                           Fullword (32 bits)
Pointers                       Fullword (32 bits)
Files                          DCB + Buffer
Call by Reference              Fullword (32 bits)
Before any code is generated for the S/370, the lengths
of each data type in the range table are calculated in S/370
storage units, using the above table. Also, the relative
locations of the subfields and variants within each record
are calculated.
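
The choice of storage unit for a subrange can be sketched as a small function. The Python below is an illustration only. The name `subrange_unit` is hypothetical, and the halfword bounds are assumed to be the full 16 bit two's complement range, which the table above is taken to intend.

```python
def subrange_unit(low, high):
    """Return (storage unit, length in bytes) for the subrange low..high."""
    if low > high:
        raise ValueError("empty subrange")
    if 0 <= low and high <= 255:
        return ("byte", 1)
    if -32768 <= low and high <= 32767:      # assumed halfword bounds
        return ("halfword", 2)
    return ("fullword", 4)
```

A subrange with a negative lower bound never fits the byte case, since the byte is used for the range 0 to 255 only.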


C.1 Location of Stacks and Displays 


The expression stack is implemented using the general
purpose registers starting from register 0, and working
upwards, and all floating point registers. These will be
referred to in this appendix as A0, A1, ..., An, and FA0,
FA1, ... . Since we are saving all registers at procedure
entry, we do not require an expression display. The run
stack and the local variable stack are merged into one
stack, known simply as the run stack. The display for this
run stack is located in the general purpose registers,
starting at register 12 and working downwards. These will
be referred to as D0, D1, ..., Dn. This display removes
the need for the static links in the block marks. The new
variable stack grows towards the run stack, and its pointer
is located in register 13. Registers 15 and 14 are used as
base registers for the program control sections. Figure C-1
illustrates the layout.


The procedure linkage mechanism presented here is a 


modified version of that found in [Clark 1971]. 
But first, we present the format of the code at the
head of each procedure (d2 is the displacement of an entry
point address of a module declared local to this module):


[Diagram: register 15 addresses the head of the procedure, which holds the preliminary code, the DELTA field and the list of EP's.]


The preliminary code checks for stack overflow. The
DELTA field supplies the length of the area required for
local variables and parameters on the run stack. The list of
EP's is a list of the addresses of procedures nested within
this one. These are collected and stored here by the code
generator.


The format of the entries on the run stack is as
follows. The first, the block mark, consists of four
entries, each four bytes long.


4 Fullwords (16 bytes) 


Below this is a 64 byte area which acts as the register save
area. (Note that the called program is responsible for saving
and restoring any floating point registers it may use, in the
local variable area.)


16  Register 0
    ...                   16 Fullwords (64 bytes)
76  Register 15


Below this is a 4 byte field which is the statement register
for this procedure:


80  Statement Register    1 Fullword (4 bytes)


Below this is the area for the local variables and parameters.


84


We make the following assumptions at the time of the
procedure linkage:


1) The call is from level n to level m.


2) The value of d1 must be such that Dn supplies
addressability when registers and parameters are
being stored. If this does not hold, we must
compute d1(Dn) in a temporary register. (d1 = 84
+ length of storage in bytes required for local
variables and parameters at level n = 84 +
DELTA(n)).
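
The offsets used in this linkage follow directly from the run stack layout and can be checked with a few lines of Python. This is a sketch. The constant and function names are hypothetical, while the numbers 16, 80 and 84 and the formula for d1 are those given in the text.

```python
BLOCK_MARK = 0            # four fullwords, bytes 0 to 15
REGISTER_SAVE = 16        # sixteen fullwords, bytes 16 to 79
STATEMENT_REGISTER = 80   # one fullword
LOCALS = 84               # local variables and parameters start here


def save_slot(register):
    """Offset of the save slot of a general purpose register (0 to 15)."""
    if not 0 <= register <= 15:
        raise ValueError("no such general purpose register")
    return REGISTER_SAVE + 4 * register


def next_block_mark(delta):
    """d1: offset from Dn of the block mark of the called procedure."""
    return LOCALS + delta
```

Since register i+1 is saved at offset 16 + 4(i+1) = 4i + 20, a load multiple beginning with register i+1 takes its operand from 4*i+20 past the block mark.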
Save Registers and Set Up Block Mark


The MKST instruction generates the following code:


STM  0,15,d1+16(Dn)      save registers
ST   Dn,d1+4(Dn)         set dynamic link


Obtain and Evaluate Parameters 


The PARM instruction informs the code generator of a
new local variable which actually belongs to the called
procedure. However, it is accessed through an address
couple which maps onto the stack location d1 + 84 +
length of previous parameters, above the contents of
Dn.
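As a small worked illustration of this mapping, the sketch below computes the displacement from Dn at which each parameter would be stored. The function name and the example lengths are hypothetical; only the formula d1 + 84 + length of previous parameters comes from the text.

```python
def parameter_offsets(d1, lengths):
    """Displacement from Dn for each parameter, in declaration order.

    lengths holds the size in bytes of each parameter.
    """
    offsets = []
    previous = 0  # total length of the parameters already placed
    for length in lengths:
        offsets.append(d1 + 84 + previous)
        previous += length
    return offsets
```

With d1 = 184 and parameters of 4, 8 and 4 bytes, the displacements are 268, 272 and 280.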


Update Display 


The CUPR instruction generates the following code plus
that of the steps which follow.


LA    Dm,d1(Dn)          point to block mark


Load Register 15 With Entry Point Address


On uplevel and brother calls only, we emit the
following instruction to obtain the address of the
called procedure:

L     15,12(Dn-1)        get base address of proc.


The following instruction loads the entry point into
register 15:


L     15,d2(15)          point to called procedure


Branch and Link to Called Procedure 
On procedure calls, the emitted instruction is: 


BALR 14,15 


Restore Registers on Return 


The following code restores the registers for a 


procedure return: 
LM 0,15,16 (Dm) 


If a function was called, indicated by the following
instruction in the pseudo-machine code not being STSR
or RETN, then we emit the following instructions (Ai or
FAi is the register which is the top free element of
the expression stack):




LM    0,Ai-1,16(Dm)          restore accumulators
L     Ai,0(Dm)               return result
LM    Ai+1,15,4*i+20(Dm)     restore remaining registers

or

LD    FAi,0(Dm)              return result
LM    0,15,16(Dm)            restore registers
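The displacement 4*i+20 follows from the register save area starting at offset 16, with four bytes per register, so register i+1 is saved at 16 + 4*(i+1). The sketch below generates the return sequence under two assumptions that are not stated in the text: that accumulator Ai is general register i, and that the first load multiple covers registers 0 through i-1. The function name is hypothetical.

```python
def restore_after_function(i, real_result, dm="Dm"):
    """Instruction list emitted after a function call returns.

    i is the index of the top free accumulator; real_result selects the
    floating point variant.
    """
    if real_result:
        # The result goes to a floating register, so all general
        # registers can be restored at once.
        return [f"LD FA{i},0({dm})", f"LM 0,15,16({dm})"]
    code = []
    if i > 0:
        code.append(f"LM 0,{i - 1},16({dm})")        # registers below Ai
    code.append(f"L {i},0({dm})")                     # function result
    if i < 15:
        code.append(f"LM {i + 1},15,{4 * i + 20}({dm})")  # registers above Ai
    return code
```

For i = 3 the last instruction is LM 4,15,32(Dm), and 32 is indeed the save slot of register 4.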


C.4 Entry Sequence


The following code appears at the head of the module:


LR T,Dn check for space
A T,DELTA(,15) add requirements
o ty oeBORLINIT ro hideleeg ate ener ale: ly 
BNH OFFSET (15) enough room left when minus 
BAL 14,STACKOVERFLOW too little left 
DELTA stack length required

contained entry
point addresses

OFFSET STM 14,15,8(Dn)
XR 0,0 zero out register 0
ST 0,80(Dn) clear statement register

establish further
addressability, if
necessary


If register 14 was used for addressability, we must
restore it:

L 14,8(Dm) restore return register


The following instruction implements the pseudo-machine
instruction RETN:


BR 14 return to calling program 




We now present code templates for the remaining
instructions in our PASCAL pseudo-machine. We shall use the
following conventions:

1) Ai is the accumulator which is the top free
element of the expression stack; Ai-1 is the next
to top element, etc.

2) even, odd is a free pair of general purpose
registers used for division and multiplication.

3) T is a temporary register.
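These conventions amount to a substitution scheme: a template is written in terms of Ai and Ai-1, and the code generator fills in the current stack depth i. The sketch below shows this for the comparison instructions, which all share one shape (compare, load 1, conditional skip, clear). The function and table names are hypothetical, and only the comparison families whose branch mnemonics are legible in the table are included.

```python
# Branch mnemonic that skips the clearing XR when the relation holds.
BRANCH = {"EQL": "BE", "GEQ": "BNL", "GRT": "BH", "LEQ": "BNH", "LST": "BL"}


def compare_template(op, kind, i, length=None):
    """Expand a comparison template for stack depth i.

    kind is "R" for reals, "S" for strings (length required), and
    anything else for the boolean, character and integer forms.
    """
    top, below = f"A{i}", f"A{i - 1}"
    if kind == "R":
        first = f"CDR F{below},F{top}"
    elif kind == "S":
        first = f"CLC 0({length},{below}),0({top})"
    else:
        first = f"CR {below},{top}"
    return [first, f"LA {below},1", f"{BRANCH[op]} *+2", f"XR {below},{below}"]
```

Here `*+2` is kept exactly as the templates write it, that is, as "skip the next instruction", without claiming it is the byte offset an assembler would need.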
Pseudo-Machine Instruction    S/370 Instruction Sequence


ABSI                          LPR   Ai,Ai
ABSR                          LPDR  Ai,Ai
ADDI                          AR    Ai-1,Ai
ADDR                          ADR   Ai-1,Ai
CHRR                          none required
CKRG                          LR    Ai+1,Ai
                              S     Ai+1,=F'lower limit'
                              BNM   *+2
                              BAL   14,RANGEERROR
                              L     Ai+1,=F'upper limit'
                              SR    Ai+1,Ai
                              BNM   *+2
                              BAL   14,RANGEERROR
CLEN                          not implemented
CSPR                          dependent on standard proc.
CUPR                          see sections C.3 and C.4
DECR                          BCTR  Ai,0
DIFF                          XR    Ai,Ai-1
                              NR    Ai-1,Ai
DIVI                          LR    even,Ai-1
                              SRDA  even,32
                              DR    even,Ai
                              LR    Ai-1,odd
DIVR                          DDR   Ai-1,Ai
EOFF                          TM    16(Ai),X'40'
                              LA    Ai,1
                              BNZ   *+2
                              XR    Ai,Ai
EQLx  x=B,C,I                 CR    Ai-1,Ai
                              LA    Ai-1,1
                              BE    *+2
                              XR    Ai-1,Ai-1
EQLR                          CDR   FAi-1,FAi
                              LA    Ai-1,1


                              BE    *+2
                              XR    Ai-1,Ai-1
EQLS                          CLC   0(length,Ai-1),0(Ai)
                              LA    Ai-1,1
                              BE    *+2
                              XR    Ai-1,Ai-1
EVAL                          L     Ai,0(,Ai)
                              or
                              LH    Ai,0(,Ai)
                              or
                              LR    T,Ai
                              XR    Ai,Ai
                              IC    Ai,0(,T)
                              or
                              LD    FAi,0(,Ai)
FJMP  location                LTR   Ai,Ai
                              BZ    location
FLOT                          requires routine for conversion
FLT2                          requires routine for conversion
GEQx  x=B,C,I                 CR    Ai-1,Ai
                              LA    Ai-1,1
                              BNL   *+2
                              XR    Ai-1,Ai-1
GEQR                          CDR   FAi-1,FAi
                              LA    Ai-1,1
                              BNL   *+2
                              XR    Ai-1,Ai-1
GEQS                          CLC   0(length,Ai-1),0(Ai)
                              LA    Ai-1,1
                              BNL   *+2
                              XR    Ai-1,Ai-1
GNSS                          LR    Ai+1,Ai
                              LA    Ai,1
                              S     Ai+1,=F'lower limit'
                              SLL   Ai,0(Ai+1)
GRTx  x=B,C,I                 CR    Ai-1,Ai
                              LA    Ai-1,1




                              BH    *+2
                              XR    Ai-1,Ai-1
GRTR                          CDR   FAi-1,FAi
                              LA    Ai-1,1
                              BH    *+2
                              XR    Ai-1,Ai-1
GRTS                          CLC   0(length,Ai-1),0(Ai)
                              LA    Ai-1,1
                              BH    *+2
                              XR    Ai-1,Ai-1
INCR                          A     Ai,=F'1'
INDA                          (one dimensional arrays only)
                              LR    even,Ai
                              S     even,=F'lower limit'
                              SRDA  even,32
                              M     even,=F'array element length'
                              LA    Ai-1,0(odd,Ai-1)
INDV                          (one dimensional only)
                              LR    even,Ai
                              S     even,=F'lower limit'
                              SRDA  even,32
                              M     even,=F'element length'
                              and either
                              L     Ai-1,0(odd,Ai-1)
                              or
                              LH    Ai-1,0(odd,Ai-1)
                              or
                              LR    T,Ai-1
                              XR    Ai-1,Ai-1
                              IC    Ai-1,0(odd,T)
                              or
                              LD    FAi-1,0(odd,T)
INNN                          LR    T,Ai-1
                              S     T,=F'lower limit'
                              LA    Ai-1,1
                              SLL   Ai-1,0(T)
                              NR    Ai,Ai-1
                              XR    Ai-1,Ai
                              LA    Ai-1,1
                              BZ    *+2
                              XR    Ai-1,Ai-1


INTR                          NR    Ai-1,Ai
LAND                          NR    Ai-1,Ai
LENG                          L     Ai,=F'length'
LEQx  x=B,C,I                 CR    Ai-1,Ai
                              LA    Ai-1,1
                              BNH   *+2
                              XR    Ai-1,Ai-1
LEQR                          CDR   FAi-1,FAi
                              LA    Ai-1,1
                              BNH   *+2
                              XR    Ai-1,Ai-1
LEQS                          CLC   0(length,Ai-1),0(Ai)
                              LA    Ai-1,1
                              BNH   *+2
                              XR    Ai-1,Ai-1
LINR                          NR    Ai,Ai-1
                              XR    Ai-1,Ai
                              LA    Ai-1,1
                              BZ    *+2
                              XR    Ai-1,Ai-1
LIOR                          OR    Ai-1,Ai
LNOT                          LCR   Ai,Ai
                              A     Ai,=F'1'
LSTx  x=B,C,I                 CR    Ai-1,Ai
                              LA    Ai-1,1
                              BL    *+2
                              XR    Ai-1,Ai-1
LSTR                          CDR   FAi-1,FAi
                              LA    Ai-1,1
                              BL    *+2
                              XR    Ai-1,Ai-1
LSTS                          CLC   0(length,Ai-1),0(Ai)
                              LA    Ai-1,1
                              BL    *+2
                              XR    Ai-1,Ai-1
MODD                          LR    even,Ai-1
                              SRDA  even,32
                              DR    even,Ai
                              LR    Ai-1,even
MOVS                          MVC   0(length,Ai-1),0(Ai)
MULI                          LR    even,Ai-1
                              SRDA  even,32
                              MR    even,Ai
                              LR    Ai-1,odd
MULR                          MDR   Ai-1,Ai
NEGI                          LCR   Ai,Ai
NEGR                          LCDR  Ai,Ai
NTEx  x=B,C,I                 CR    Ai-1,Ai
                              LA    Ai-1,1
                              BNE   *+2
                              XR    Ai-1,Ai-1
NTER                          CDR   FAi-1,FAi
                              LA    Ai-1,1
                              BNE   *+2
                              XR    Ai-1,Ai-1
NTES                          CLC   0(length,Ai-1),0(Ai)
                              LA    Ai-1,1
                              BNE   *+2
                              XR    Ai-1,Ai-1
ODDD                          N     Ai,=F'1'
ORDD                          none required
PARM                          see section C.3
PCAD                          L     Ai,=A(...)
PCVx                          L     Ai,=F'...'
PCVN                          L     Ai,=X'FFFFFFFF'
PCVR                          LD    Ai,=D'...'
PRED                          BCTR  Ai,0


PUSA 


PUSV 


RETN 


RINL 


ars 


SEe 


vay ER ae Ss 


rl ORE 


STOR 


ate es 


LD 


Ai, displacement (, Dn) 


Ai, displacement (, Dn) 
or 

Ai, displacement (, Dn) 
Or 

RIZAL 

Ai, displacement (, Dn) 
Or: 

FAI, displacement (,Dn) 


see section C.5 


NR 
XR 
LAs 
BZ 
XR 


LA 
L 

LH 
LR 


XR 
LG 


Ai-1,Ai 
Ai,Ai~1 
Ai=V1 
*+2 
Ai-1,Ai-1 


Ai,displacement (,Ai) 


Ai,displacement(,Ai) 
or 
Ai,displacement({,Ai) 
or 

T AL 

AVAL 
Ai,displacement(,T) 
Or . 
FAI,displacement (,Ai) 


Ai,0€,Dn) 


Ai,0(,Dn) 
or 
FAi, 0 (;, Dn) 


14, STOP 


Ai, 0 (,Ai-—1} 
or 
Ai,0(,A1-1) 
OG : 
Ai, Oto Ri 
Orr 
FAi,0(,Ai-1) 




STOW                          ST    Ai,displacement(,Dn)
                              or
                              STH   Ai,displacement(,Dn)
                              or
                              STC   Ai,displacement(,Dn)
                              or
                              STD   FAi,displacement(,Dn)
STSR                          MVI   2(Dm),X'..'
                              MVI   3(Dm),X'..'
SQRI                          LR    even,Ai
                              SRDA  even,32
                              MR    even,odd
                              LR    Ai,odd
SQRR                          MDR   Ai,Ai
SUBI                          SR    Ai-1,Ai
SUBR                          SDR   Ai-1,Ai
SUCC                          A     Ai,=F'1'
TRNC                          routine for conversion
UJMP  location                B     location
UNIN                          OR    Ai-1,Ai
XJMP  location                B     location(,Ai)
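Several of the set templates above rely on short bit-mask identities that are easy to check by hand. The sketch below mirrors two of them register by register, treating each set as a 32-bit mask. It assumes the DIFF sequence reads XR Ai,Ai-1 followed by NR Ai-1,Ai, and it uses the LINR sequence as printed; the Python function names are hypothetical.

```python
MASK = 0xFFFFFFFF  # one fullword


def diff(below, top):
    """DIFF: leaves in Ai-1 the elements of Ai-1 that are not in Ai."""
    top = (top ^ below) & MASK      # XR Ai,Ai-1
    below = (below & top) & MASK    # NR Ai-1,Ai
    return below


def linr(below, top):
    """LINR: leaves 1 in Ai-1 when every element of Ai-1 is also in Ai."""
    top = (top & below) & MASK      # NR Ai,Ai-1
    below = (below ^ top) & MASK    # XR Ai-1,Ai  (zero when below is a subset)
    return 1 if below == 0 else 0   # LA Ai-1,1 / BZ *+2 / XR Ai-1,Ai-1
```

The first function reduces to `below & ~top`, the usual set difference, and the second to a subset test.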
APPENDIX D


Unsupported Features of PASCAL


The following features of PASCAL are not supported by
the compiler described in Chapter 3.

• parametric procedures
• exit labels and GOTO's leading out of scopes
• files

The following features of PASCAL are not supported by
the code generator described in Appendix C.

• multi-dimensional arrays. This is not limiting
since arrays of arrays are allowed.

• the packed attribute and the standard procedures
pack and unpack.
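To illustrate why arrays of arrays make the first restriction harmless, the sketch below applies the one-dimensional indexing rule (base + (index - lower limit) * element length, as in the INDA template) once per level. The function names and the example bounds are hypothetical.

```python
def element_address(base, index, lower, element_length):
    """One-dimensional indexing, as performed by a single INDA."""
    return base + (index - lower) * element_length


def nested_address(base, indices, bounds, scalar_length):
    """Address of an element in an array of arrays.

    bounds lists (lower, upper) for each level, outermost first.
    """
    # Element length at each level: the innermost is the scalar itself,
    # each outer level holds a whole inner array.
    lengths = [scalar_length]
    for lower, upper in reversed(bounds[1:]):
        lengths.insert(0, lengths[0] * (upper - lower + 1))
    address = base
    for index, (lower, _), length in zip(indices, bounds, lengths):
        address = element_address(address, index, lower, length)
    return address
```

For an array [1..3] of array [0..4] of 4-byte integers based at 1000, element [2][3] lies at 1000 + 1*20 + 3*4 = 1032.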
APPENDIX E


Subset of SUE Utilized


Only certain features of the SUE System Language which
have direct counterparts in PASCAL were used in writing the
compiler described in Chapter 3. The major features are:

• The cycle ... exit ... end construct was limited to
replacing the while do and repeat until constructs
of PASCAL.

• Records and arrays were used.

• Macros were defined and stacks allocated to
simulate the new standard procedure of PASCAL.

• Sets were not used, since the maximum number of
elements they may contain is machine dependent.
Instead, arrays of booleans were used.

• All character variables were limited to arrays of
character(1).

• The do end control construct was used.


a ae ee es 


end were used, 
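The replacement of sets by arrays of booleans mentioned in the list above can be pictured with the following sketch. It is written in Python rather than SUE, and the class and method names are hypothetical; the point is only that membership, insertion and union need nothing beyond indexed boolean storage, so the representation does not depend on the word size of the machine.

```python
class BooleanArraySet:
    """A set over 0..size-1 stored as one boolean per possible element."""

    def __init__(self, size):
        self.flags = [False] * size

    def add(self, element):
        self.flags[element] = True

    def contains(self, element):
        return self.flags[element]

    def union(self, other):
        """Element-wise OR of two sets of the same size."""
        result = BooleanArraySet(len(self.flags))
        result.flags = [a or b for a, b in zip(self.flags, other.flags)]
        return result
```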